Skip to content

Microsoft SQL Server connector

You can ingest data from your Microsoft SQL Server using CDC pipelines, and replicate that data to a destination of your choice.

The Microsoft SQL Server connector captures row-level changes that occur in the schemas of a SQL Server database. The SQL Server connector connects to a SQL Server database or cluster, it takes a consistent snapshot of the schemas in the database. After the initial snapshot is complete, the connector continuously captures row-level changes for INSERT, UPDATE, or DELETE operations that are committed to the SQL Server databases that are enabled for CDC.

Note

  • Encryption is enabled by default. For more information, see the encryption setting in Microsoft SQL advanced settings.
  • The SQL Server CDC process will not capture computed values. These values will appear as NULL in the change table.
  • We are using the Debezium SQL Server connector to connect to this source. For more information, read Debezium connector for SQL Server.

Prerequisites

To use CDC on a SQL Server database, ensure:

  • The Microsoft SQL Server version is 13 (SQL Server 2016 SP1) or higher.
  • The Microsoft SQL Server Agent is running.
  • CDC is enabled on the SQL Server database, as described in Configuring MS SQL Database.

Connect to Microsoft SQL Server

When selecting Microsoft SQL Server as a source during pipeline creation, you will be required to provide the following information:

Property Description
Server address The server address of your Microsoft SQL Server database.
Port The port number used to access your Microsoft SQL Server database.
Database Name The name of your Microsoft SQL Server Container Database installation.
Username The username used to log in to the specified database.
Secret Provider Choose the provider for your secrets manager that contains your database password.
Secret Name Enter the name of the secret that corresponds to your database password. Read Manage secrets to learn how to add a named secret to your provider's secret manager.
Read Replica Specify whether the source database is an Always On read-only replica database.

Tables

Property Description
All Tables Tables available from your defined Microsoft SQL Server source will be displayed in Matillion CDC. These tables are defined and created by the user in their Microsoft SQL Server database and thus cannot be described here.
Tables to extract and load Selecting and moving a table to this column will include it in the CDC pipeline.

Understanding CDC for SQL Server

The following notes will assist you in understanding how CDC works for SQL Server, and help you troubleshoot any problems. For a full explanation of CDC for SQL Server, read the Microsoft documentation.

  • We do not interrogate the transaction log; instead, we use the change data capture in SQL Server.
  • A CDC schema will be created in the database that is enabled for CDC, and then system tables are created for the following:
    • cdc.captured_columns
    • cdc.change_tables
    • cdc.ddl_history
    • cdc.index_columns
    • cdc.lsn_time_mapping
  • Transparent Data Encryption (TDE) is handled and supported.
  • By default, three days of data is retained in the SQL Server change tables.
  • If it appears that the pipeline is not picking up data, you can validate that the internal CDC process is working by querying the change tracking tables (cdc.captured_columns) created in the database.

Next steps

You need to configure and set up your Microsoft SQL Server database. For detailed information on how to do this, checkout the Configuring Microsoft SQL Server database guide.