Upgrade: API Extract
The Data Productivity Cloud replaces Matillion ETL's API Extract component with custom connectors, flexible components that can be configured to connect to any compatible third-party API.
If you have a job that uses the API Extract component, the migrated pipeline will use a custom connector that replicates the functionality of your extract component. Most of the configuration from your existing API Extract profile is preserved during migration, but some settings require manual reconfiguration. This document outlines the steps needed, along with best practices and important considerations.
Note
If your job uses the API Query component, read API Query instead.
Upgrade path
First, export your existing API Extract profiles from Matillion ETL and import them into the Data Productivity Cloud. This will create and configure an equivalent custom connector in the Data Productivity Cloud, preserving much of the configuration from your API Extract profile. You can then export the Matillion ETL job that uses the API Extract component and import it into the Data Productivity Cloud. The new pipeline will reference the custom connector previously created from your API Extract profile.
Exporting and importing extract profiles
- In Matillion ETL, click Project → Manage API Profiles → Manage Extract Profiles.
- In the Manage Extract Profiles dialog, locate the profile you want to export and click the download icon next to it. This will download the profile as a .json file to your local filesystem (the exact location depends on your browser and operating system).
- Navigate to the Data Productivity Cloud.
- In the left navigation, click the Custom Connectors icon, then select Custom Connectors from the menu.
- In the top-right of the Custom connectors list, click Import.
- In the Import connector dialog, select the .json file that you exported your extract profile to, then click Import. This will create a new custom connector in the Data Productivity Cloud based on the imported profile.
- On the custom connector details page, review the imported configuration and ensure that all endpoints, authentication, and other parameters are configured as required.
    - The new custom connector will have the same name as the API Extract profile. Keep this name to ensure that the migrated pipeline correctly references the connector.
    - Pagination isn't imported and must be configured from scratch in the Data Productivity Cloud, although the pagination strategies follow the same concepts as in Matillion ETL. The Data Productivity Cloud also offers a script pagination option. For ease of migration, consider using this method: if all endpoints share the same pagination logic, you can write the pagination script once and reuse it across all endpoints, reducing duplication (see the sketch after this list).
    - Create a new authentication configuration in the Data Productivity Cloud and assign it to each endpoint manually. The authentication type is inferred from the imported API Extract profile.
    - Parameters are inferred from the imported profile and typically require no manual reconfiguration. Headers are also preserved during import. The request body, if configured in Matillion ETL, is retained in the imported profile.
- Click Save. You must do this even if you haven't made any manual changes, otherwise the imported configuration won't be saved.
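The exact scripting interface for pagination is defined by the Data Productivity Cloud, so the sketch below doesn't reproduce it; it only illustrates, in Python, the cursor-based logic such a script typically encodes. The endpoint URL and the `items` and `next_cursor` response fields are hypothetical; substitute the names your API actually uses.

```python
import requests

BASE_URL = "https://api.example.com/v1/records"   # hypothetical endpoint
API_TOKEN = "YOUR_API_TOKEN"                      # supplied via your authentication configuration

def fetch_all_records():
    """Cursor-based pagination: keep requesting pages until no cursor is returned."""
    cursor = None
    while True:
        params = {"limit": 100}
        if cursor:
            params["cursor"] = cursor                          # cursor from the previous page
        response = requests.get(
            BASE_URL,
            params=params,
            headers={"Authorization": f"Bearer {API_TOKEN}"},  # header defined once, reused per page
        )
        response.raise_for_status()
        payload = response.json()
        yield from payload["items"]                            # hypothetical repeat element
        cursor = payload.get("next_cursor")                    # hypothetical pagination field
        if not cursor:                                         # last page reached
            break
```

Because the looping logic depends only on the response fields, the same pattern can be written once in a pagination script and reused by every endpoint that paginates the same way.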
Configuring the custom connector in the pipeline
- Export the Matillion ETL job that contains the API Extract component and import it into the Data Productivity Cloud, as described in Import your jobs into the Data Productivity Cloud.
- The imported pipeline will contain a custom connector component in place of the API Extract component. Most of the component properties will be pre-configured based on the imported profile, but some manual configuration is required:
    - Configure the component's authentication, because Matillion doesn't import your secrets for security reasons.
    - Properties that didn't exist in Matillion ETL are assigned default values so that the Custom Connector component replicates the behavior of API Extract wherever possible. Review these values and adjust them as needed.
    - Ensure that any parameters referring to specific services used by the Matillion ETL job are updated if your Data Productivity Cloud environment uses different services. For example, if your Matillion ETL job stages data in an S3 bucket but your Data Productivity Cloud pipeline uses Azure Blob Storage, update the relevant parameters accordingly.
    - Note that the output schema may differ from that in Matillion ETL; for example, table columns may have different names. Review the output and adjust downstream components if necessary.
Platform-specific information
For Snowflake and Amazon Redshift data warehouses:
- If Load Selected Data is set to No and a repeat element is configured, the raw JSON is staged into a VARIANT column named Data Value in Snowflake, or a SUPER column named data in Redshift.
- If you don't configure a repeat element, the system loads data according to the schema definition. It automatically unnests the first level of nesting, which may make the data appear flattened.
To maintain compatibility with Matillion ETL, so that your migrated transformation pipelines can use the data as-is, set Load Selected Data to No. The sketch below contrasts the two loading behaviors.
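The following Python snippet is only a rough illustration of the difference between the two behaviors, using an invented response record; the actual column layout depends on your schema definition and should be verified in the output.

```python
import json

# Invented example of a single API response record (not real connector output).
record = {
    "id": 42,
    "name": "Acme",
    "address": {"city": "Berlin", "country": "DE"},  # nested object
}

# Load Selected Data = No with a repeat element configured:
# the raw JSON is staged unchanged into one semi-structured column
# (a VARIANT named "Data Value" on Snowflake, a SUPER named "data" on Redshift),
# so transformation pipelines migrated from Matillion ETL can keep parsing it themselves.
staged_raw = {"Data Value": json.dumps(record)}

# No repeat element configured: data is loaded according to the schema definition,
# with the first level of nesting unnested, so the output can appear flattened.
# The exact shape below is an assumption shown only for contrast; review your real output.
possible_flattened = {
    "id": 42,
    "name": "Acme",
    "address": json.dumps(record["address"]),
}

print(staged_raw)
print(possible_flattened)
```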
For Databricks warehouses:
- Data is always loaded according to the defined schema.
- There is no implicit flattening or nesting manipulation.
Best practices
- Start your migration with API Extract jobs that use no authentication or no pagination, as these import cleanly.
- Use the preview feature for custom connectors to validate your configuration before creating your pipelines.
- Review your schema settings and adjust data type mappings if discrepancies occur.
Comparing API Extract and custom connectors
| Feature | API Extract | Custom Connector | Notes | 
|---|---|---|---|
| Profiles | Configured in Matillion ETL | Imported or created as a custom connector | |
| Pagination | Configured for each endpoint | Configured for each endpoint, supports script pagination | Not imported during migration and must be configured again | 
| Headers/Parameters | Manually defined | Manually defined or inferred from imported profile | |
| Authentication | Configured in API Extract component | Defined once and assigned to each endpoint | Not imported during migration and must be configured again | 
| Flattening/Schema | Basic JSON flattening | Depends on repeat element configuration in Matillion ETL | |
| Load Selected Data | Implicit from schema | Must be configured in the Data Productivity Cloud | Set to No to match the behavior in Matillion ETL, or set to Yes as a refactor |
| Semi-structured handling | Component-specific | Warehouse-specific | |