Flatten Variant
The Flatten Variant transformation component unpacks arrays of semi-structured data, for example in a JSON file, into structured data, in the form of columns and rows of data in a table. The component is especially useful when using a Custom Connector to access an API that returns data in JSON format.
Flatten Variant takes input from a table with a single column. This single column contains all of the data in an unpacked, raw form. Each element in the source semi-structured data will then be mapped to a different column in the target table. For example, consider the following JSON structure, containing an array of three elements:
{
"name": "varchar",
"png": "varchar",
"alt": "varchar"
}
When data in this format is "flattened", the three array elements can be mapped to three columns in the target table. You use Flatten Variant's Column Mapping property to define which element in the source array will be mapped to which column in the target structured data. For example, to map the above structure you might complete the Column Mapping dialog as follows:
Column | Property | Type | Alias |
---|---|---|---|
JSON-INPUT | name | VARCHAR | Name |
JSON-INPUT | png | VARCHAR | Flag |
JSON-INPUT | alt | VARCHAR | Alt_Text |
Note that JSON-INPUT
in this example is the name of the column in our input that contains the raw JSON data.
When we sample the data from the transformation, we will see something like the following table:
Name | Flag | Alt_Text |
---|---|---|
Cyprus | https://flagcdn.com/w320/cy.png | The flag of Cyprus. |
Somalia | https://flagcdn.com/w320/so.png | The flag of Somalia. |
Venezuela | https://flagcdn.com/w320/ve.png | The flag of Venezuela. |
If an element of the source data contains a nested array, the column it is mapped to will contain the entire nested array as a single string, requiring a further step to extract that array's elements into their own columns. Use the Column Flattens property for this.
Note
For an alternative method of extracting semi-structured data, you can use the Extract Nested Data component.
Properties
Name
= string
A human-readable name for the component.
Include Input Columns
= boolean
Choose whether to include input columns in the output table.
Column Mapping
= column editor
Specify each item in the source array that will be mapped to a column in the target table. For each item, complete the following fields in the Column Mapping dialog.
- Column: Select the column in the source table that contains the raw semi-structured data.
- Property: The name of the element within the semi-structured data that you want to map to an output column.
- Type: The data type of the output column you are mapping the element to.
- Alias: The name of the output column you are mapping the element to.
If your source data contains nested arrays, use the Column Flattens property to further extract the contents of those arrays into additional columns before mapping them here.
You can use Grid Variables in this dialog if desired.
Column Flattens
= column editor
If an element of the source data contains a nested array, the Column Flattens property will allow you to extract the nested array's elements into individual elements that can then be mapped to columns in the Column Mapping property. For each nested element you want to extract, complete the following fields in the Column Flattens dialog.
- Column: Select the column in the source table that contains the raw semi-structured data.
- Property: The name of the element within the nested array that you want to map to a column.
- Alias: The name of the column you are mapping the element to. The alias you create here can then be selected in the Column drop-down of the Column Mapping dialog.
- Outer: Determines how to handle input rows that can't be expanded (for example, because they have no fields to expand, or because they can't be accessed by the name you gave as the Property). Select No to completely omit these rows from the output, or Yes to generate an output row with
NULL
values. - Recursive: Select No if only the input element referenced by the Property field, is to be expanded. Select Yes for the expansion to be performed on all sub-elements recursively.
- Mode: Select whether the expansion should be performed on only Object elements, only Array elements, or Both types of element.
You can extract elements from any level of array nesting by using standard JSON dot notation to address the element in the Property field. To address a nested field, you specify a path to it by listing the elements containing it in the JSON structure, separating each element with a dot. For example, consider this nested structure:
{
"country": {
"flags": {
"alt": "varchar",
"png": "varchar"
},
},
}
To address the alt
element of this data, you need to enter country.flags.alt
in the Property field. Dot notation allows you to reach any level of nesting in the structure.
You can use Grid Variables in this dialog if desired.
Snowflake | Databricks | Amazon Redshift |
---|---|---|
✅ | ❌ | ❌ |