Skip to content

Configuring MySQL database

The steps in this guide for configuring a MySQL database are necessary to use the MySQL connector. These steps are based on advice in the MySQL documentation. It's recommended that you familiarize yourself with this before continuing.


Step 1: Configure MySQL Binary Logs for Replication

A binary log is a set of log files that keep track of data, and data object changes performed on a MySQL server instance. Binary logs are commonly used for data replication and recovery.

Data Loader supports binary log data ingestion for replication from MySQL servers (BinLog). Binary logging must be enabled on your MySQL server for this to work. This can be done using the MySQL server configuration file or the MySQL server starting options.

Follow these steps to set up BinLog replication:

  1. Check whether the log-bin option is already on.

  2. Access the MySQL database that you want to check for BinLog activity:

    mysql -h hostname -u user -p database
    
  3. Open a secure shell and enter the command:

    // for MySql 5.x
    mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::"
    FROM information_schema.global_variables WHERE variable_name='log_bin';
    
    // for MySql 8.x
    mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::"
    FROM performance_schema.global_variables WHERE variable_name='log_bin';
    

    If this statement returns a value of 1, BinLog is active. If the value returned is 0, this means that BinLog is disabled. To enable it, follow the steps below.

  4. Enable BinLog replication. If it's OFF, configure your MySQL server configuration file with the following:

    1. Log in to your MySQL server instance.
    2. Check your MySQL Server configuration:
    sudo nano /etc/mysql/my.cnf
    

    Note

    In some cases, this may be sudo nano /etc/my.cnf.

  5. In the configuration file, ensure the following configurations are specified. If they aren't specified, add them now:

    sql [mysqld] binlog_format=ROW binlog_row_image=FULL expire_logs_days=10 -- The retention period (`expire_log_days`) can also be set in seconds by using the command: `binlog_expire_logs_seconds=259200` log_bin=mysql-binlog -- For ubuntu, use: `/var/log/mysql/mysql-bin.log` server-id=1 -- (only in the case of ubuntu)

  6. Restart the MySQL server using the command service mysql restart on your instance.

  7. After a restart, log in to the MySQL server to check the BinLog again:

    // for MySql 5.x
    mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::"
    FROM information_schema.global_variables WHERE variable_name='log_bin';
    
    // for MySql 8.x
    mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::"
    FROM performance_schema.global_variables WHERE variable_name='log_bin';
    

    The value returned is now 1. This indicates that BinLog is active.

    Note

    The retention period should ideally be at least 72 hours (3 days). This assists Matillion in ensuring that no log files are missing when Historical Data loading is enabled.


Descriptions of MySQL BinLog configuration properties

Property Description
server-id The value for the server-id must be unique for each server, and replication client in the MySQL cluster. During the MySQL connector set up, the CDC agent assigns a unique server ID to the connector.
log_bin The value of log_bin is the base name of the sequence of the BinLog files.
BinLog_format The BinLog_format must be set to ROW or row.
BinLog_row_image The BinLog_row_image must be set to FULL or full.
expire_logs_days The number of days for an automatic BinLog file removal. The default is 0, which means no automatic removal. Set the value to match the needs of your environment.

Step 2: Grant privileges to the user

  1. Check that the database user specified in the CDC Pipeline has the following global privileges:

    • Select
    • SUPER or (REPLICATION_CLIENT and REPLICATION_SLAVE)

    Note

    The suggested grant privileges above provide access to all schemas and tables. Check with your DBA for an appropriate mask.

  2. There are two methods for setting up these privileges:

    1. Go to your MySQL server. For example: mysql -u root -p, and grant SELECT privileges to the user, using the following command:

      GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON \*.* TO 'cdcuser'@'%';
      

      Note

      The username must match the user specified in the CDC pipeline, or vice versa.

    2. If the MySQL database is deployed via RDS, grant the authenticated user of the agent an additional permission called LOCK TABLES, using the following command:

      GRANT SELECT, RELOAD, SHOW DATABASES, LOCK TABLES, REPLICATION SLAVE, REPLICATION CLIENT ON \*.* TO 'cdcuser'@'%';
      

      Note

      If you're deploying the MySQL database using Amazon RDS or Amazon Aurora that doesn't allow a global read lock, table-level locks are used to create a consistent snapshot.


Step 3: Specify MySQL Connection Settings

In the Configure your MySQL Source page in Data Loader, specify the following:

Property Description
Server address The server address of your MySQL database.
Port The port number used to access your MySQL database. By default this is 3306.
Database Name The name of your installed MySQL container database.
Username The username used to log in to the specified database.
Secret Provider Choose the provider for your secrets manager that contains your database password: AWS Secrets Manager, Azure Key Vault, or Google Secret Manager.
Secret Name Enter the name of the secret that corresponds to your database password.