API Query and API Extract

The Data Productivity Cloud replaces Matillion ETL's API Query and API Extract with custom connectors, flexible components that can be configured to connect to any compatible third-party API.

If you have a job that uses the API Query or API Extract component, refactor the migrated pipeline to use custom connectors that replicate the functionality of those components.


API Query

To migrate the API Query component to the Data Productivity Cloud:

  1. Recreate the connector as a custom connector.
  2. Replace the "Unknown Component" in the migrated pipeline with the new custom connector.

API Extract

The goal of migrating the API Extract component to the Data Productivity Cloud is to avoid rebuilding existing API Extract pipelines from scratch. Instead, export your existing API Extract profiles and import them into custom connectors, then replace the API Extract component in your Matillion ETL pipeline with the corresponding custom connector component in the Data Productivity Cloud. Repeat the export and import process for each API Extract profile that you want to use in the Data Productivity Cloud.
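
If you want to sanity-check an exported profile before importing it, a short script like the one below can confirm that the file is valid JSON. This is an optional, illustrative sketch: the file path is hypothetical, and the internal structure of the exported profile isn't documented here, so the script only reports the top-level keys.

```python
# Optional sanity check of an exported API Extract profile before import.
# The path is hypothetical; because the profile's internal structure isn't
# documented here, the script only confirms the file parses as JSON and
# lists its top-level keys.
import json
from pathlib import Path

profile_path = Path("Downloads/my_extract_profile.json")  # hypothetical path

with profile_path.open(encoding="utf-8") as f:
    profile = json.load(f)  # raises json.JSONDecodeError if the file is corrupt

if isinstance(profile, dict):
    print("Top-level keys:", sorted(profile.keys()))
else:
    print("Profile root is a", type(profile).__name__)
```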

To migrate the API Extract component to the Data Productivity Cloud, perform the following steps. Some steps require action, while others only ask you to check your configuration.

  1. In Matillion ETL, click Project → Manage API Profiles → Manage Extract Profiles.
  2. In the Manage Extract Profiles dialog, locate the profile you want to export and click the download icon next to it. This downloads the profile as a .json file to your local filesystem (the exact location depends on your browser and operating system).
  3. Navigate to the Data Productivity Cloud.
  4. In the left navigation, click the Custom Connectors icon, then select Custom Connectors from the menu.
  5. In the top-right of the Custom connectors list, click Import.
  6. Select the .json file that you exported your extract profile to, then click Import.

    Best practice

    Give the new custom connector the same name as the API Extract profile used in your Matillion ETL pipeline. This ensures that the pipeline correctly references the connector.

  7. Create a new authentication configuration in the Data Productivity Cloud and assign it to each endpoint manually. The authentication type is inferred from the imported API Extract profile.

  8. Parameters are inferred from the imported profile and typically require no manual reconfiguration. Headers are also preserved during import. The request body, if configured in Matillion ETL, will be retained in the imported profile.
  9. Pagination is never imported and must be configured from scratch in the Data Productivity Cloud, although the pagination strategies are conceptually similar to those in Matillion ETL.

    Note

    The Data Productivity Cloud offers a script pagination option. For ease of migration, consider using this method: if all endpoints share the same pagination logic, you can write the pagination script once and reuse it across all endpoints, reducing duplication. A conceptual sketch of this kind of pagination logic appears after this list.

  10. In the left navigation, click the Projects icon.

  11. Select your project, then your branch, and then open the pipeline where you want to use the custom connector.
  12. Your API Extract component will be displayed as an "Unknown Component" in the migrated pipeline. Replace this component with the new custom connector. For more information about adding components to Data Productivity Cloud pipelines, read Add components to a pipeline.
  13. The schema, including the selected repeat element, is inferred from the imported profile. Click the custom connector component and configure the Load Selected Data property in the advanced settings to control what data is loaded.
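
The script pagination option mentioned in step 9 is easiest to think of as a small function that inspects each response and decides whether another page exists. The sketch below is illustrative only: it is written in Python rather than the Data Productivity Cloud's scripting interface, and the next_cursor field is a hypothetical example of a cursor-based API. It shows the kind of logic you would write once and reuse across endpoints that paginate the same way.

```python
# Illustrative sketch only. This is not the Data Productivity Cloud scripting
# interface; it models the cursor-based logic a pagination script typically
# encodes, so you can see what needs to be defined per endpoint.
from typing import Optional


def next_page_params(response_json: dict) -> Optional[dict]:
    """Return query parameters for the next page, or None when finished.

    Assumes the API returns a 'next_cursor' field (hypothetical); substitute
    whatever token, offset, or link your API actually exposes.
    """
    cursor = response_json.get("next_cursor")
    if not cursor:
        return None
    return {"cursor": cursor}


# Simulated run over two canned response pages (no real API calls).
pages = [
    {"items": [1, 2], "next_cursor": "abc"},
    {"items": [3], "next_cursor": None},
]
params: Optional[dict] = {}
for page in pages:
    print("fetched", page["items"], "with params", params)
    params = next_page_params(page)
    if params is None:
        break
```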

Target platform-specific information

For Snowflake and Amazon Redshift data warehouses:

  • If Load Selected Data is set to No, and a repeat element is configured, the raw JSON is staged into a VARIANT column named Data Value in Snowflake, or a SUPER column named data in Redshift.
  • If a repeat element is not configured, data is loaded per schema definition. The first level of nesting is automatically unnested, which can give the appearance of data flattening.

To remain compatible with Matillion ETL behavior, so that your migrated transformation pipelines can use the data as-is, set Load Selected Data to No.
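
To illustrate what the staged data looks like when Load Selected Data is No, the hedged sketch below queries the raw JSON column in Snowflake with LATERAL FLATTEN. The connection details, table name, repeated element path, and JSON field names are all hypothetical, and you should confirm the exact name of the VARIANT column in your loaded table before adapting it.

```python
# Hypothetical example: read raw JSON staged by a custom connector when
# Load Selected Data is No and a repeat element is configured. Connection
# details, table name, repeated element path, and field names are
# placeholders; check the actual VARIANT column name in your loaded table.
import os

import snowflake.connector  # pip install snowflake-connector-python

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="MY_WH",
    database="MY_DB",
    schema="PUBLIC",
)

# LATERAL FLATTEN expands a repeated JSON array into one row per element so a
# transformation pipeline can consume it; "items", "id", and "name" stand in
# for whatever your endpoint actually returns.
sql = """
SELECT
    f.value:"id"::string   AS record_id,
    f.value:"name"::string AS record_name
FROM api_extract_stage t,
     LATERAL FLATTEN(input => t.data_value:"items") f
"""

cur = conn.cursor()
try:
    for row in cur.execute(sql):
        print(row)
finally:
    cur.close()
    conn.close()
```

A Redshift SUPER column can be unnested in a similar way using PartiQL-style navigation; the same caveat about verifying the staged column name applies.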

For Databricks warehouses:

  • Data is always loaded according to the defined schema.
  • There is no implicit flattening or nesting manipulation.

Best practices

  • Start migration with non-authenticated or non-paginated API Extract pipelines, which import cleanly.
  • Use the preview feature for custom connectors to validate your configuration before creating your pipelines.
  • Review your schema settings and adjust data type mappings if discrepancies occur.

Comparing API Extract and custom connectors

| Feature | API Extract | Custom Connector | Migration-relevant information |
| --- | --- | --- | --- |
| Profiles | Configured in Matillion ETL | Imported or created in custom connector | - |
| Pagination | Configured for each endpoint | Configured for each endpoint; supports script pagination | Not imported during migration and must be configured again |
| Headers/Parameters | Manually defined | Manually defined or inferred from imported profile | - |
| Authentication | Configured in API Extract component | Defined once and assigned to each endpoint | Not imported during migration and must be configured again |
| Flattening/Schema | Basic JSON flattening | Depends on repeat element configuration in Matillion ETL | - |
| Load Selected Data | Implicit from schema | Must be configured in the Data Productivity Cloud | Set to No to match behavior in Matillion ETL, or set to Yes as a refactor |
| Semi-structured handling | Component-specific | Warehouse-specific | - |