Configuring MySQL database
The steps in this guide for configuring a MySQL database are necessary to use the MySQL connector. These steps are based on advice in the MySQL documentation. It's recommended that you familiarize yourself with this before continuing.
Prerequisites
- When setting up with MySQL, you need a MySQL user with the following permissions:
- SELECT
- RELOAD
- SHOW DATABASES
- CREATE, ALTER, and DROP (optional)
A sample SQL statement to create a user with the necessary permissions:
CREATE USER 'your_user'@'%' IDENTIFIED BY 'your_password';
GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'user' IDENTIFIED BY 'password';
FLUSH PRIVILEGES;
Step 1: Configure MySQL Binary Logs for Replication
A binary log is a set of log files that keep track of data, and data object changes performed on a MySQL server instance. Binary logs are commonly used for data replication and recovery.
Data Loader supports binary log data ingestion for replication from MySQL servers (BinLog). Binary logging must be enabled on your MySQL server for this to work. This can be done using the MySQL server configuration file or the MySQL server starting options.
Follow these steps to set up BinLog replication:
-
Check whether the
log-bin
option is already on. -
Access the MySQL database that you want to check for BinLog activity:
mysql -h hostname -u user -p database
-
Open a secure shell and enter the command:
// for MySql 5.x mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::" FROM information_schema.global_variables WHERE variable_name='log_bin';
// for MySql 8.x mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::" FROM performance_schema.global_variables WHERE variable_name='log_bin';
If this statement returns a value of
1
, BinLog is active. If the value returned is0
, this means that BinLog is disabled. To enable it, follow the steps below. -
Enable BinLog replication. If it's
OFF
, configure your MySQL server configuration file with the following:- Log in to your MySQL server instance.
- Check your MySQL Server configuration:
sudo nano /etc/mysql/my.cnf
Note
In some cases, this may be
sudo nano /etc/my.cnf
. -
In the configuration file, ensure the following configurations are specified. If they aren't specified, add them now:
sql [mysqld] binlog_format=ROW binlog_row_image=FULL expire_logs_days=10 -- The retention period (`expire_log_days`) can also be set in seconds by using the command: `binlog_expire_logs_seconds=259200` log_bin=mysql-binlog -- For ubuntu, use: `/var/log/mysql/mysql-bin.log` server-id=1 -- (only in the case of ubuntu)
-
Restart the MySQL server using the command
service mysql restart
on your instance. -
After a restart, log in to the MySQL server to check the BinLog again:
// for MySql 5.x mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::" FROM information_schema.global_variables WHERE variable_name='log_bin';
// for MySql 8.x mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::" FROM performance_schema.global_variables WHERE variable_name='log_bin';
The value returned is now
1
. This indicates that BinLog is active.Note
The retention period should ideally be at least 72 hours (3 days). This assists Matillion in ensuring that no log files are missing when Historical Data loading is enabled.
Descriptions of MySQL BinLog configuration properties
Property | Description |
---|---|
server-id | The value for the server-id must be unique for each server, and replication client in the MySQL cluster. During the MySQL connector set up, the Streaming agent assigns a unique server ID to the connector. |
log_bin |
The value of log_bin is the base name of the sequence of the BinLog files. |
BinLog_format | The BinLog_format must be set to ROW or row . |
BinLog_row_image |
The BinLog_row_image must be set to FULL or full . |
expire_logs_days |
The number of days for an automatic BinLog file removal. The default is 0 , which means no automatic removal. Set the value to match the needs of your environment. |
Step 2: Grant privileges to the user
-
Check that the database user specified in the CDC Pipeline has the following global privileges:
Select
SUPER
or (REPLICATION_CLIENT
andREPLICATION_SLAVE
)
Note
The suggested grant privileges above provide access to all schemas and tables. Check with your DBA for an appropriate mask.
-
There are two methods for setting up these privileges:
-
Go to your MySQL server. For example:
mysql -u root -p
, and grantSELECT
privileges to the user, using the following command:GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON \*.* TO 'cdcuser'@'%';
Note
The username must match the user specified in the CDC pipeline, or vice versa.
-
If the MySQL database is deployed via RDS, grant the authenticated user of the agent an additional permission called
LOCK TABLES
, using the following command:GRANT SELECT, RELOAD, SHOW DATABASES, LOCK TABLES, REPLICATION SLAVE, REPLICATION CLIENT ON \*.* TO 'cdcuser'@'%';
Note
If you're deploying the MySQL database using Amazon RDS or Amazon Aurora that doesn't allow a global read lock, table-level locks are used to create a consistent snapshot.
-
Step 3: Specify MySQL Connection Settings
In the Configure your MySQL Source page in Data Loader, specify the following:
Property | Description |
---|---|
Server address | The server address of your MySQL database. |
Port | The port number used to access your MySQL database. By default this is 3306 . |
Database Name | The name of your installed MySQL container database. |
Username | The username used to log in to the specified database. |
Secret Provider | Choose the provider for your secrets manager that contains your database password: AWS Secrets Manager, Azure Key Vault, or Google Secret Manager. |
Secret Name | Enter the name of the secret that corresponds to your database password. |