
Oracle streaming connector

The Oracle connector can monitor and record the row-level changes in your Oracle database. The connector ingests change events using Oracle's native LogMiner database package. Oracle LogMiner is part of the Oracle database utilities and provides an interface for querying online and archived redo log files.

Note

We are using the Debezium Oracle connector to connect to this source. For more information, read Debezium connector for Oracle.


How the connector works

Your Oracle servers may not have been configured to retain the complete history of changes within the Oracle redo logs. As a result, it will be necessary to take a consistent snapshot of the database before moving on to the streaming stage. This will ensure that streaming will only begin once the history of your schema or tables has been captured.

Note

Depending on the number of rows within your schema, this may take a while to complete.

The snapshotting process involves the following actions against the tables monitored by your streaming pipeline:

  • Obtain a ROW SHARE MODE lock on each of the monitored tables to ensure that no structural changes can occur.
  • Read the current system change number (SCN) position in the server's redo log.
  • Capture the structure of all relevant tables.
  • Release the locks obtained earlier (these locks are only held for a short period of time).
  • Generate change events for each row (as valid at the SCN captured earlier).
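
For reference, the SCN that the snapshot is based on can also be queried directly on the server. A query such as the following returns the current SCN (shown for illustration only; the connector reads it automatically):

SELECT CURRENT_SCN FROM V$DATABASE;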

Once the snapshot stage has completed, the connector will move to the streaming stage. All changes that have occurred since the snapshot started will be captured during the streaming process—no changes will be missed.

The streaming stage will continue to monitor and consume changes as they occur within the database. Any produced change events will be exported to your selected data lake in a consistent and predictable pattern. The connector will maintain an acceptable time lag behind the source database; this lag can be monitored through the pipeline dashboard.

Note

  • The connector is tolerant of failures. As the connector reads changes and produces events, it records the SCN position for each successful event captured.
  • If the connector stops for any reason—including communication failures, network problems, or crashes—then upon restart, the connector will continue streaming where it left off.
  • If the connector stopped at any point during the snapshot stage, a new snapshot will be taken on restart of the pipeline.

Prerequisites

Review the following information about supported versions, limitations, and database rules.

Versions

Streaming supports Oracle versions 18c and above (including minor versions).
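
If you're unsure which version your database is running, you can check it with a query such as:

SELECT BANNER FROM V$VERSION;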

Limitations

  • The Oracle connector does not support either logical or physical standby databases.
  • Oracle Autonomous Database isn't supported.

Database rules

Your Oracle database must be configured to follow a set of rules before streaming can capture changes. Read Configure your Oracle database for the full list of requirements.

Source setup

Refer to this section when you create a streaming pipeline.

Server address = string

The server address of your Oracle database.


Port = integer

The port number used to access your Oracle database. The default is 1521.


Single or Container database (CDB) = string

If your source is a single (non-container) database, enter its name here. Otherwise, enter the name of the container database (CDB) for a CDB/PDB architecture.


Pluggable Database (PDB) = string

The name of the PDB to use within the CDB named above. Leave blank if your CDB does not contain a PDB. PDB names containing special characters are not supported.
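
If you're unsure of the PDB name, you can list the pluggable databases while connected to the CDB root with a query such as:

SELECT NAME, OPEN_MODE FROM V$PDBS;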


Username = string

The username used to log in to the specified database.


Secrets Manager = drop-down

Choose the service you use to manage secrets.


Secret name = string

The name of the secret in your secrets manager service that references your Oracle password.


JDBC parameters and specific connection settings = column editor

This parameter is optional.

Specify any parameter:value pairs required for an advanced connection. Click Save to finish adding parameters.
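
For example, you might add a connection timeout in milliseconds using the standard Oracle JDBC property below (shown only as an illustration; add parameters only if you need them):

oracle.net.CONNECT_TIMEOUT:30000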

Click Connect to establish the connection to your source database.


Create a signal table

Follow this section to create an Oracle signal table.

If required, navigate into the pluggable database. Replace <pdb> with your pluggable database name:

ALTER SESSION SET CONTAINER=<pdb>;

Create the table

Create the signal table with the following SQL statement:

CREATE TABLE <schema>.<table> (
    id VARCHAR(42) PRIMARY KEY,
    type VARCHAR(32) NOT NULL,
    data VARCHAR(2048) NULL
    );

Replace <schema> and <table> with the appropriate names according to your database naming convention.

The signal table should have three columns:

  • id: An identifier for each signal execution of type VARCHAR(42).
  • type: Indicates the type of signal being sent of type VARCHAR(32).
  • data: Stores additional data related to the signal of type VARCHAR(2048).
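
For illustration, an on-demand snapshot signal written to this table typically takes the following form in Debezium, where the id value is arbitrary and the data payload lists the tables to snapshot. The pipeline normally writes these signals for you; this example only shows how the columns are used:

INSERT INTO <schema>.<table> (id, type, data)
VALUES ('ad-hoc-1', 'execute-snapshot', '{"data-collections": ["<schema>.<table-to-snapshot>"]}');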

Enable supplemental logging

Enable supplemental logging for the signal table with the ALTER TABLE statement:

ALTER TABLE <schema>.<table> ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

Replace <schema> and <table> with the names used in the create statement.

Grant user privileges

Grant the INSERT privilege on the signal table to the database user configured for running the pipeline:

GRANT INSERT ON <schema>.<table> TO <streaming-user>;

Replace <schema> and <table> with the names used in the create statement and <streaming-user> with the user configured for streaming.

By correctly configuring the signal table with the appropriate structure, supplemental logging, and user privileges, you can enable on-demand snapshots in the Oracle connector. This allows new tables added to the pipeline after the initial startup to be automatically snapshotted, capturing both historical and future changes.

If on-demand snapshots are disabled, any new tables added to the pipeline will capture changes from that point onwards but will not sync their historical data with the target, and the pipeline will immediately have a STREAMING status.


Low change frequency sources

The Debezium Oracle connector tracks system change numbers in the connector offsets, so that when the connector is restarted it can begin where it left off. However, in a low change frequency environment, these change numbers may become stale due to older numbers not being retained as part of the local log retention policy. To avoid this, you can set up a heartbeat table and define an interval for inserting a record into that table. This will ensure that changes are regularly applied to avoid this issue. For more information, read heartbeat.interval.ms and heartbeat.action.query.

You will also need to add supplemental logging for your heartbeat table when you configure your Oracle database.

Note

A "heartbeat table" is a standard Oracle table that is built to create a frequent and consistent stream of data to prevent log staleness. For more information, read Low change frequency offset management.


Advanced settings

Advanced settings are optional. Advanced settings are entered as parameter:value pairs in the Add advanced settings dialog. Some parameters you might want to add are discussed below.

Note

For a full list of the configuration properties, read the Debezium documentation.

RAC enabled applications

If your Oracle configuration is RAC enabled, set the rac.nodes parameter accordingly.

defaultRowPrefetch

Note

defaultRowPrefetch is an Oracle JDBC property. For more information, read getDefaultRowPrefetch.

You may wish to set defaultRowPrefetch, which sets the number of rows to prefetch from the server. The default is 10. When there are 1000 or more records in a table, set defaultRowPrefetch to 1000 in the advanced settings; with the default value of 10, the connection may time out before all records are retrieved.
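
For example, to raise the prefetch size you would enter the following parameter:value pair in the Add advanced settings dialog:

defaultRowPrefetch:1000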

rac.nodes

Set to 192.168.1.100,192.168.1.101:1522.

A comma-separated list of Oracle Real Application Clusters (RAC) node host names or addresses. This field is required to enable compatibility with an Oracle RAC deployment. Each node uses the default port configured for the Oracle connection; to override the port for a particular node, append it to that node's address as shown in the example above. For more information, read rac.nodes.

log.mining.transaction.retention.ms

Set to 2000000.

A positive integer value that specifies the number of milliseconds to retain long running transactions between redo log switches. When set to 0, transactions are retained until a commit or rollback is detected. Any transaction that exceeds this configured value is discarded entirely, and the connector does not emit any messages for the operations that were part of the transaction. For more information, read log.mining.transaction.retention.ms.

heartbeat.interval.ms

Set to 20000.

Specifies, in milliseconds, how frequently the connector sends messages to a heartbeat topic. Use this property to determine whether the connector continues to receive change events from the source database. It can also be useful to set the property in situations where no change events occur in captured tables for an extended period. For more information, read heartbeat.interval.ms.

heartbeat.action.query

    INSERT INTO <HEARTBEAT_TABLE_NAME> (<COLUMN_NAME>) VALUES (<SAMPLE_TEXT>);

Set this property and create a heartbeat table to receive the heartbeat messages to resolve situations in which Debezium fails to synchronize offsets on low-traffic databases that are on the same host as a high-traffic database. For more information, read heartbeat.action.query.


Configuring new tables

When adding tables to a pipeline, they must be correctly configured to allow the on-demand snapshot to take place and to ensure future changes on those tables will be streamed. Read Configure your Oracle database for a more comprehensive outline of these requirements.

  1. If required, navigate to the specific pluggable database (PDB) where the new tables belong. Use the following SQL command, replacing <pdb> with your PDB name:

    ALTER SESSION SET CONTAINER=<pdb>;
    
  2. For each new table added to the pipeline, enable supplemental logging so that Oracle captures the information required for streaming. Use the following SQL command:

        -- Run for each table added to a pipeline
        ALTER TABLE <schema>.<table> ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
    

    Replace <schema> and <table> with the appropriate schema and table names for each new table.

  3. Verify the status of logging for all tables owned by a given user, using the following SQL query:

    SELECT OWNER, LOG_GROUP_NAME, TABLE_NAME, LOG_GROUP_TYPE, "ALWAYS", "GENERATED"
    FROM DBA_LOG_GROUPS
    WHERE OWNER = '<username>';
    

    Replace <username> with the name of the user that the new tables belong to.

  4. Check streaming user permissions. If you have configured the streaming user, you shouldn't need to grant any new permissions. Any permissions previously granted to the streaming user on individual tables should also apply to the new tables.
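
    If you find that the streaming user cannot read a newly added table, you can grant SELECT on it explicitly. A minimal example, with placeholder names to replace:

        GRANT SELECT ON <schema>.<table> TO <streaming-user>;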