Skip to content

Flatten Variant

The Flatten Variant transformation component unpacks arrays of semi-structured data, for example in a JSON file, into structured data, in the form of columns and rows of data in a table. The component is especially useful when using a Custom Connector to access an API that returns data in JSON format.

Flatten Variant takes input from a table with a single column. This single column contains all of the data in an unpacked, raw form. Each element in the source semi-structured data will then be mapped to a different column in the target table. For example, consider the following JSON structure, containing an array of three elements:

{
    "name": "varchar",
    "png": "varchar",
    "alt": "varchar"
}

When data in this format is "flattened", the three array elements can be mapped to three columns in the target table. You use Flatten Variant's Column Mapping property to define which element in the source array will be mapped to which column in the target structured data. For example, to map the above structure you might complete the Column Mapping dialog as follows:

Column Property Type Alias
JSON-INPUT name VARCHAR Name
JSON-INPUT png VARCHAR Flag
JSON-INPUT alt VARCHAR Alt_Text

Note that JSON-INPUT in this example is the name of the column in our input that contains the raw JSON data.

When we sample the data from the transformation, we will see something like the following table:

Name Flag Alt_Text
Cyprus https://flagcdn.com/w320/cy.png The flag of Cyprus.
Somalia https://flagcdn.com/w320/so.png The flag of Somalia.
Venezuela https://flagcdn.com/w320/ve.png The flag of Venezuela.

If an element of the source data contains a nested array, the column it is mapped to will contain the entire nested array as a single string, requiring a further step to extract that array's elements into their own columns. Use the Column Flattens property for this.

Note

For an alternative method of extracting semi-structured data, you can use the Extract Nested Data component.


Properties

Name = string

A human-readable name for the component.


Include Input Columns = boolean

Choose whether to include input columns in the output table.


Column Mapping = column editor

Specify each item in the source array that will be mapped to a column in the target table. For each item, complete the following fields in the Column Mapping dialog.

  • Column: Select the column in the source table that contains the raw semi-structured data.
  • Property: The name of the element within the semi-structured data that you want to map to an output column.
  • Type: The data type of the output column you are mapping the element to.
  • Alias: The name of the output column you are mapping the element to.

If your source data contains nested arrays, use the Column Flattens property to further extract the contents of those arrays into additional columns before mapping them here.

You can use Grid Variables in this dialog if desired.


Column Flattens = column editor

If an element of the source data contains a nested array, the Column Flattens property will allow you to extract the nested array's elements into individual elements that can then be mapped to columns in the Column Mapping property. For each nested element you want to extract, complete the following fields in the Column Flattens dialog.

  • Column: Select the column in the source table that contains the raw semi-structured data.
  • Property: The name of the element within the nested array that you want to map to a column.
  • Alias: The name of the column you are mapping the element to. The alias you create here can then be selected in the Column drop-down of the Column Mapping dialog.
  • Outer: Determines how to handle input rows that can't be expanded (for example, because they have no fields to expand, or because they can't be accessed by the name you gave as the Property). Select No to completely omit these rows from the output, or Yes to generate an output row with NULL values.
  • Recursive: Select No if only the input element referenced by the Property field, is to be expanded. Select Yes for the expansion to be performed on all sub-elements recursively.
  • Mode: Select whether the expansion should be performed on only Object elements, only Array elements, or Both types of element.

You can extract elements from any level of array nesting by using standard JSON dot notation to address the element in the Property field. To address a nested field, you specify a path to it by listing the elements containing it in the JSON structure, separating each element with a dot. For example, consider this nested structure:

{
    "country": {
        "flags": {
            "alt": "varchar",
            "png": "varchar"
        },
    },
}

To address the alt element of this data, you need to enter country.flags.alt in the Property field. Dot notation allows you to reach any level of nesting in the structure.

You can use Grid Variables in this dialog if desired.


Snowflake Databricks Amazon Redshift