Configure your MySQL database
The steps in this guide for configuring a MySQL database are necessary to use the MySQL connector. These steps are based on advice in the MySQL documentation. It's recommended that you familiarize yourself with this before continuing.
Prerequisites
- When setting up MySQL, you need a MySQL user with the following permissions:
- SELECT
- RELOAD
- SHOW DATABASES
- CREATE
- ALTER
- DROP (optional)
A sample SQL statement to create a user with the necessary permissions:
CREATE USER 'your_user'@'%' IDENTIFIED BY 'your_password';
GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'your_user' IDENTIFIED BY 'your_password';
FLUSH PRIVILEGES;
Configure MySQL binary logs for replication
Binary log (BinLog) replication must be enabled.
A binary log is a set of log files that keep track of data and data object changes performed on a MySQL server instance. Binary logs are commonly used for data replication and recovery.
The Data Productivity Cloud supports binary log data ingestion for replication from MySQL servers (BinLog). Binary logging must be enabled on your MySQL server for this to work. This can be done using the MySQL server configuration file or the MySQL server starting options.
Follow these steps to set up BinLog replication.
First, check whether the log-bin
option is already on.
-
Access the MySQL database that you want to check for BinLog activity:
mysql -h hostname -u user -p database
-
Open a secure shell and enter the command:
// for MySql 8.x mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::" FROM performance_schema.global_variables WHERE variable_name='log_bin';
If this statement returns a value of 1
, BinLog is active. If the value returned is 0
, this means that BinLog is disabled. To enable it, follow the steps below.
- Log in to your MySQL server instance.
-
Check your MySQL Server configuration:
sudo nano /etc/mysql/my.cnf
Note
In some cases, this may be
sudo nano /etc/my.cnf
. -
In the configuration file, ensure the following configurations are specified. If they aren't specified, add them now:
[mysqld] binlog_format=ROW binlog_row_image=FULL expire_logs_days=10 -- The retention period (`expire_log_days`) can also be set in seconds by using the command: `binlog_expire_logs_seconds=259200` log_bin=mysql-binlog -- For ubuntu, use: `/var/log/mysql/mysql-bin.log` server-id=1 -- (only in the case of ubuntu)
-
Restart the MySQL server using the command
service mysql restart
on your instance. -
After a restart, log in to the MySQL server to check the BinLog again:
// for MySql 8.x mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::" FROM performance_schema.global_variables WHERE variable_name='log_bin';
The value returned should now be
1
. This indicates that BinLog is active.Note
The retention period should ideally be at least 72 hours (3 days). This assists Matillion in ensuring that no log files are missing when historical data loading is enabled.
Descriptions of MySQL BinLog configuration properties
Property | Description |
---|---|
server-id | The value for the server-id must be unique for each server, and replication client in the MySQL cluster. During the MySQL connector set up, the streaming agent assigns a unique server ID to the connector. |
log_bin |
The value of log_bin is the base name of the sequence of the BinLog files. |
BinLog_format | The BinLog_format must be set to ROW or row . |
BinLog_row_image |
The BinLog_row_image must be set to FULL or full . |
expire_logs_days |
The number of days for an automatic BinLog file removal. The default is 0 , which means no automatic removal. Set the value to match the needs of your environment. |
Grant privileges to the user
-
Check that the database user specified in the streaming pipeline has the following global privileges:
Select
SUPER
or (REPLICATION_CLIENT
andREPLICATION_SLAVE
)
Note
The suggested grant privileges above provide access to all schemas and tables. Check with your database administrator for an appropriate mask.
-
There are two methods for setting up these privileges:
-
Go to your MySQL server. For example:
mysql -u root -p
, and grantSELECT
privileges to the user, using the following command:GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON \*.* TO 'cdcuser'@'%';
Note
The username must match the user specified in the streaming pipeline, or vice versa.
-
If the MySQL database is deployed via RDS, grant the authenticated user of the agent an additional permission called
LOCK TABLES
, using the following command:GRANT SELECT, RELOAD, SHOW DATABASES, LOCK TABLES, REPLICATION SLAVE, REPLICATION CLIENT ON \*.* TO 'cdcuser'@'%';
Note
If you're deploying the MySQL database using Amazon RDS or Amazon Aurora that doesn't allow a global read lock, table-level locks are used to create a consistent snapshot.
-