Oracle connector
The Streaming agent's Oracle connector can monitor and record the row-level changes in your Oracle database. The connector ingests change events using Oracle's native LogMiner database package. Oracle LogMiner is part of the Oracle database utilities and provides an interface for querying online and archived redo log files.
Note
- Matillion CDC supports Oracle versions 18c and above (including minor versions).
- Your Oracle database must be configured for CDC. For details, see Configuring Oracle database.
- The Oracle connector does not support logical or physical standby databases.
- The connector uses the Debezium Oracle connector to connect to this source. For more information, read Debezium connector for Oracle.
Connect to Oracle
When selecting Oracle as a source during pipeline creation, you will be required to provide the following information:
Property | Description |
---|---|
Server address | The server address of your Oracle database. |
Port | The port number used to access your Oracle database. |
Container Database (CDB) Name | The name of your Oracle Container Database installation. |
Pluggable Database (PDB) Name | The name of the PDB to be used that's contained within the CDB named above. Leave blank if your CDB does not contain a PDB. PDB names containing special characters are not supported. |
Username | The username used to log in to the specified database. |
Secret Provider | Choose the provider for your secrets manager that contains your database password. |
Secret Name | Enter the name of the secret that corresponds to your database password. Read Manage secrets to learn how to add a named secret to your provider's secret manager. |
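Taken together, the properties above describe a single connection definition. The following sketch shows one way to picture those values; every value here is a hypothetical placeholder for illustration, not a default, and the dictionary itself is not part of the Data Loader configuration format.

```python
# Hypothetical illustration of the values collected when defining an Oracle source.
# All values are placeholders, not defaults.
oracle_source = {
    "server_address": "db.example.com",      # Server address
    "port": 1521,                            # Port
    "cdb_name": "ORCLCDB",                   # Container Database (CDB) Name
    "pdb_name": "ORCLPDB1",                  # Pluggable Database (PDB) Name; empty if no PDB
    "username": "cdc_user",                  # Username
    "secret_provider": "AWS Secrets Manager",  # Secret Provider
    "secret_name": "oracle-cdc-password",    # Secret Name (password is never entered directly)
}
```

Note that the password itself is never supplied directly: only the name of a secret held in your secrets manager is stored with the pipeline.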
Tables
When configuring the tables for CDC, you will see two columns:
Property | Description |
---|---|
All Tables | Tables available from your defined Oracle source are displayed in Data Loader CDC. These tables are defined and created by the user in their Oracle database, so their contents vary by installation. |
Tables to extract and load | Selecting and moving a table to this column will include it in the CDC pipeline. |
How it works
Most Oracle servers are not configured to retain the complete history of changes in the Oracle redo logs. As a result, the connector takes a consistent snapshot of the database before moving on to the streaming stage. This ensures that streaming begins only once the history of your schema or tables has been captured.
Note
Depending on the number of rows within your schema, this may take a while to complete.
If you wish to omit historic data from your CDC pipeline, you can skip the snapshot stage by disabling snapshots in the Pipeline settings.
The snapshotting process involves the following actions against your CDC monitored tables:
- Obtain a ROW SHARE MODE lock on each of the monitored tables to ensure that no structural changes can occur.
- Read the current system change number (SCN) position in the server's redo log.
- Capture the structure of all relevant tables.
- Release the locks obtained earlier (these locks are only held for a short period of time).
- Generate change events for each row (as valid at the SCN captured earlier).
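The sequence above can be sketched in Python. This is an illustrative simulation of the snapshot stage only, under the assumption that tables are plain in-memory dictionaries; the locks, SCN counter, and event type are stand-ins, not the connector's actual implementation.

```python
import threading
from dataclasses import dataclass

@dataclass
class ChangeEvent:
    table: str
    row: dict
    scn: int  # the snapshot SCN the row was valid at

def take_snapshot(tables: dict, current_scn: int) -> list:
    """Simulated snapshot: lock, read SCN, capture structure, unlock, emit rows."""
    locks = {name: threading.Lock() for name in tables}
    for lock in locks.values():      # 1. take a (simulated) ROW SHARE MODE lock per table
        lock.acquire()
    snapshot_scn = current_scn       # 2. read the current SCN position
    schema = {name: sorted({col for row in rows for col in row})
              for name, rows in tables.items()}   # 3. capture each table's structure
    for lock in locks.values():      # 4. release the locks (held only briefly)
        lock.release()
    return [ChangeEvent(name, row, snapshot_scn)  # 5. one event per row, valid at snapshot_scn
            for name, rows in tables.items() for row in rows]

events = take_snapshot({"customers": [{"id": 1}, {"id": 2}]}, current_scn=1000)
```

Note that the locks are released before the row events are generated: only the structural capture needs the tables locked, which is why the real connector holds its locks for such a short period.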
Note
Once the snapshot stage has completed, the connector will move to the streaming stage. All changes that have occurred since the snapshot started will be captured during the streaming process—no changes will be missed.
The streaming stage will continue to monitor and consume changes as they occur within the database. Any produced change events will be exported to your selected data lake in a consistent and predictable pattern. The connector will maintain an acceptable time lag behind the source database; this lag can be monitored through the Data Loader UI.
Note
- The connector is tolerant of failures. As the connector reads changes and produces events, it records the SCN position for each successful event captured.
- If the connector stops for any reason—including communication failures, network problems, or crashes—then upon restart, the connector will continue streaming where it left off.
- If the connector stopped at any point during the snapshot stage, a new snapshot will be taken on restart of the pipeline.
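The fault-tolerance behaviour described in this note can be illustrated with a small offset-tracking sketch. The function and variable names are hypothetical, and the real connector persists its SCN offsets durably rather than in a Python dictionary; this only demonstrates the resume-from-last-SCN idea.

```python
def stream_changes(redo_log: list, offsets: dict) -> list:
    """Process (scn, change) entries newer than the last recorded SCN.

    `offsets` stands in for durable offset storage, so a run that stopped
    partway can resume where it left off without missing or repeating events.
    """
    delivered = []
    for scn, change in redo_log:
        if scn <= offsets.get("last_scn", -1):
            continue                  # already captured before the stop/failure
        delivered.append(change)      # export the change event
        offsets["last_scn"] = scn     # record the SCN after each successful event
    return delivered

log = [(1, "INSERT a"), (2, "UPDATE a"), (3, "DELETE a")]
offsets = {}
first = stream_changes(log[:2], offsets)   # connector stops after SCN 2
resumed = stream_changes(log, offsets)     # on restart, only SCN 3 is new
```

Because the SCN is recorded per successful event rather than per batch, a restart replays nothing that was already exported and skips nothing that was not.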