Extract Nested Data

The Extract Nested Data component flattens nested data into rows. This is generally done by taking nested data in the form of key:value pairs (such as a JSON dictionary) and using those keys as column names.
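The idea can be illustrated with a minimal Python sketch (this is only an illustration of the flattening concept, not the component's actual implementation): nested keys become column names, joined here with underscores.

```python
import json

# Illustrative sketch: keys of a nested JSON dictionary become
# column names; values become the row's values.
record = json.loads('{"id": 1, "address": {"city": "London", "zip": "EC1"}}')

def flatten(obj, prefix=""):
    """Recursively flatten nested dicts, joining key paths with underscores."""
    row = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=name + "_"))
        else:
            row[name] = value
    return row

print(flatten(record))
# {'id': 1, 'address_city': 'London', 'address_zip': 'EC1'}
```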

The flow into this component should include a single variant-type column that is to be unpacked.

In Amazon Redshift, the flow into this component should include a single SUPER-type column to be unpacked.

In Google BigQuery, when flattening data, the number of rows output will be the product of the number of items in every array field selected, per input row. If any are empty, no output rows will be returned.
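The row-count behaviour described above can be sketched in Python with a Cartesian product (an illustration of the arithmetic only, not how BigQuery executes it):

```python
from itertools import product

# Per input row, the number of output rows equals the product of the
# lengths of all selected array fields.
row = {"colors": ["red", "green"], "sizes": ["S", "M", "L"]}

expanded = [dict(zip(row.keys(), combo)) for combo in product(*row.values())]
print(len(expanded))  # 2 * 3 = 6 output rows

# If any selected array is empty, the product is zero and the
# input row produces no output rows at all.
empty = {"colors": [], "sizes": ["S", "M", "L"]}
print(len(list(product(*empty.values()))))  # 0
```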


Properties

Name = string

A human-readable name for the component.


Include Input Columns = drop-down

Choose whether to include input columns.


Columns = editor

Define the variant's structure here. Unique endpoints in this structure will be unpacked into columns. The variant from the input can usually be automatically detected, and its structure determined, using the Autofill button. Elements of the structure can be manually edited by right-clicking them on the tree and selecting "Delete" or "Edit" as appropriate.


Outer Join = drop-down

If False (the default), any input rows that can't be expanded, either because they can't be accessed in the path or because they have zero fields or entries, are completely omitted from the output. If True, a single row is generated for expansions with zero rows.
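The effect of this setting can be sketched as follows, assuming a single array field per input row (an illustration of the described behaviour, not the component's implementation):

```python
rows = [
    {"id": 1, "items": ["a", "b"]},
    {"id": 2, "items": []},  # cannot be expanded: zero entries
]

def expand(rows, outer_join=False):
    out = []
    for row in rows:
        if row["items"]:
            out.extend({"id": row["id"], "item": i} for i in row["items"])
        elif outer_join:
            # Outer Join = True: emit a single row with a null value
            # instead of dropping the input row entirely.
            out.append({"id": row["id"], "item": None})
    return out

print(len(expand(rows)))                   # 2 -- row with id=2 is omitted
print(len(expand(rows, outer_join=True)))  # 3 -- row with id=2 kept, item=None
```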


Input Alias = string

If two properties have identical names, one will be given this prefix to differentiate them. More than two identically named properties will result in an error. This does not need changing in the vast majority of use cases.


Array Prefix = string

If two array structures have identical names, one will be given this prefix to differentiate them. More than two identically named structures will result in an error. This does not need changing in the vast majority of use cases.


Casting Method = drop-down

Select a casting method. The options are Fail on invalid data (the default), Replace all unparseable values with null, and Replace unparseable dates and timestamps with null.
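The difference between the options can be sketched with a hypothetical timestamp cast (the function name and date format here are assumptions for illustration only):

```python
from datetime import datetime

def cast_timestamp(value, on_error="fail"):
    """Illustrative cast: parse a date string, handling invalid data
    according to the chosen casting behaviour."""
    try:
        return datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        if on_error == "null":
            # "Replace ... with null": swallow the error, emit null.
            return None
        raise  # "Fail on invalid data" (the default): surface the error

print(cast_timestamp("2023-01-15"))                    # parsed datetime
print(cast_timestamp("not-a-date", on_error="null"))   # None
```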


Case Column Alias Names = drop-down

Set the casing for alias column names. The settings are Lower, No, or Upper. The default is No.

Name = string

A human-readable name for the component.


Columns = drop-down

Select the columns to be written to the table.


Outer Join = drop-down

If False (the default), any input rows that can't be expanded, either because they can't be accessed in the path or because they have zero fields or entries, are completely omitted from the output. If True, a single row is generated for expansions with zero rows.


Column Aliases = column editor

  • Source Column: Select the source column that you wish to provide an alternative name (an alias) for.
  • Target Column: Provide an alternative name for the column.

This property is optional and both columns can be left empty if the user does not wish to provide alternative names for columns.


Include Input Columns = drop-down

Choose whether or not to include input columns in the data extraction.


Input Column Prefix = string

Give a prefix name for the input columns. The default is input_.

Name = string

A human-readable name for the component.


Columns = editor

Select the columns to be written to the table.


Column Alias = string

An alias for the input component reference (for example, Table Input or Fixed Flow). The default alias is i.


Include Input Columns = drop-down

Choose whether or not to include input columns. The default setting is No.


Array Prefix = string

A prefix for the input's array. This property will increment inputs with one-based indexing (for example, f1, f2, f3, ...) where differentiation is required. The default value is f.

Name = string

A human-readable name for the component.


Columns = drop-down

Select the columns to be written to the table.


Outer Join = drop-down

If False (the default), any input rows that can't be expanded, either because they can't be accessed in the path or because they have zero fields or entries, are completely omitted from the output. If True, a single row is generated for expansions with zero rows.


Column Aliases = string

Give an alternative name for the columns you select.


Include Input Columns = drop-down

Choose whether to include input columns.


Input Column Prefix = string

Give a prefix name for the input columns.


This component is available for Snowflake, Delta Lake on Databricks, Amazon Redshift, Google BigQuery, and Azure Synapse Analytics.