CDC Shared Job
Matillion ETL supports Change Data Capture (CDC) and provides shared jobs for loading the captured change data from your chosen cloud storage area into your cloud data platform.
The Streaming agent, which is installed on your source database, is responsible for loading the data into your preferred cloud storage area. You have the option to choose from Amazon S3, Google Cloud Storage, or Azure Blob Storage as your cloud storage destination.
Together, Matillion ETL's CDC capabilities and the Streaming agent capture changes made to your source database and land them in the cloud storage area of your choice, keeping the data current and available for further processing or analysis.
The shared jobs provided by Matillion ETL can read the data from cloud storage and load it into target tables on your chosen data platform. Depending on your configuration and the selected shared job, you can load a single table or all tables into the target.
For configuration guides covering your specific use case and cloud data platform, read CDC Shared Jobs.
Schema drift support
Matillion ETL also supports schema drift, which refers to changes in the structure or schema of the source tables. The shared jobs include a parameter called Schema Drift Action that allows you to toggle the behavior for handling schema drift. You can find detailed information on how to configure this functionality in the appropriate configuration guide for your specific use case.
Schema drift support covers the following cases:
- Missing Columns: If the source schema changes and some columns are no longer present, Matillion ETL will load those missing columns as NULL values in the target tables.
- Data Type Changes: Schema drift support also accommodates changes in data types for specific columns. You can configure the shared job to handle these data type changes during the loading process.
- Missing Tables: If a table is no longer present in the source schema, Matillion ETL skips loading that table. However, if a missing table is specified in the shared job configuration, the shared job fails. To resolve this, edit the shared job's Table and Columns grid variable to remove the missing table from the configuration.
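The missing-column and missing-table rules above can be illustrated with a minimal Python sketch. This is not Matillion ETL's implementation: the function names `align_record` and `plan_tables`, their parameters, and the sample table names are hypothetical and exist only to make the described behavior concrete. Data type changes are not modelled, and source columns that do not exist in the target are simply ignored here.

```python
from typing import Any, Iterable


def align_record(record: dict[str, Any], target_columns: list[str]) -> dict[str, Any]:
    """Shape a source record like the target table.

    Columns that are no longer present in the source become None,
    which stands in for NULL in the target table.
    """
    return {col: record.get(col) for col in target_columns}


def plan_tables(
    candidate_tables: Iterable[str],
    source_tables: set[str],
    configured_tables: Iterable[str] = (),
) -> list[str]:
    """Decide which tables to load (hypothetical helper).

    Tables missing from the source are skipped, unless one of them was
    explicitly configured, in which case the run fails.
    """
    missing = [t for t in configured_tables if t not in source_tables]
    if missing:
        # Mirrors the documented failure: remove these tables from the
        # configuration before re-running.
        raise ValueError(f"Configured tables missing from source: {missing}")
    return [t for t in candidate_tables if t in source_tables]


if __name__ == "__main__":
    # "email" was dropped from the source, so it is loaded as None (NULL).
    print(align_record({"id": 1, "name": "a"}, ["id", "name", "email"]))
    # "customers" no longer exists in the source, so it is skipped.
    print(plan_tables(["orders", "customers"], {"orders"}))
```

In this sketch, calling `plan_tables(["orders"], {"orders"}, ["customers"])` raises `ValueError`, which corresponds to the case where a missing table must be removed from the Table and Columns grid variable.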
To learn more, read Schema drift.