Migration: Extract Nested Data
In Matillion ETL for Databricks, the Extract Nested Data component uses the Struct data type to flatten nested data.
In the Data Productivity Cloud, the Extract Nested Data component for Databricks instead uses the Variant data type. Because the underlying data type differs, this component doesn't behave the same way as its Matillion ETL counterpart.
To preserve the original behavior for Databricks projects, the Data Productivity Cloud provides a separate component, Extract Structured Data, which continues to use the Struct data type.
Migration mapping
During migration of Databricks projects from Matillion ETL to the Data Productivity Cloud:
- The Extract Nested Data component from Matillion ETL is mapped to the Extract Structured Data component in the Data Productivity Cloud.
- Because both components use the Struct data type, this mapping preserves the behavior of migrated pipelines.
Note
- Although the Data Productivity Cloud includes an Extract Nested Data component for Databricks, the migration doesn't use it because it relies on the Variant data type.
- The mapping applies only to Databricks projects. Snowflake and Amazon Redshift projects are unaffected and don't require any special component mapping.
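The mapping rule described above can be pictured as a simple lookup. The following Python sketch is illustrative only: the function name `map_component`, the `COMPONENT_OVERRIDES` table, and the lowercase platform keys are hypothetical and aren't part of any Matillion tool or API. It also assumes, for the purposes of the sketch, that a component without an override keeps its name.

```python
# Hypothetical illustration of the mapping rule; not Matillion's migration tool.

# (platform, Matillion ETL component) -> Data Productivity Cloud component.
# Only Databricks projects have an override for Extract Nested Data.
COMPONENT_OVERRIDES = {
    ("databricks", "Extract Nested Data"): "Extract Structured Data",
}


def map_component(platform: str, component: str) -> str:
    """Return the target component name for a migrated component."""
    key = (platform.strip().lower(), component)
    # Assumption: components with no override keep their original name.
    return COMPONENT_OVERRIDES.get(key, component)


if __name__ == "__main__":
    for platform in ("Databricks", "Snowflake", "Redshift"):
        print(platform, "->", map_component(platform, "Extract Nested Data"))
```

Under these assumptions, only the Databricks entry resolves to Extract Structured Data; the Snowflake and Amazon Redshift entries pass through unchanged.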