Configuring MySQL Database
  • Dark
    Light

Configuring MySQL Database

  • Dark
    Light

Overview

The following database setup steps are necessary to use the MySQL connector. These steps are based on advice that can be found in the MySQL documentation and we recommend reading that documentation thoroughly.


Step 1: Configure MySQL Binary Logs for Replication

A binary log is a set of log files that keep track of data and data object changes performed on a MySQL server instance. Binary logs are commonly used for data replication and recovery.

Matillion Data Loader supports binary log data ingestion for replication from MySQL servers (BinLog). Binary logging must be enabled on your MySQL server for this to work. This can be done using the MySQL server configuration file or the MySQL server starting options.

Follow these steps to set up BinLog replication:

  1. Check whether the log-bin option is already on:
  2. Access the MySQL database that you want to check for BinLog activity:
mysql -h hostname -u user -p database
  1. Open a secure shell and enter the command:
// for MySql 5.x
mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::"
FROM information_schema.global_variables WHERE variable_name='log_bin';
// for MySql 8.x
mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::"
FROM performance_schema.global_variables WHERE variable_name='log_bin';

If this statement returns a value of 1, BinLog is active. If value returned is 0, this means that BinLog is disabled. To enable it, follow the steps below.

  1. Enable BinLog replication. If it is OFF, configure your MySQL server configuration file with the following:
  2. Log in to your MySQL server instance.
  3. Check your MySQL Server configuration:
sudo nano /etc/mysql/my.cnf
Note

This may be sudo nano /etc/my.cnf in some cases.

  1. In the configuration file, ensure the following configurations are specified. If they aren't specified, add them now:
[mysqld]
binlog_format=ROW
binlog_row_image=FULL
expire_logs_days=10  -- The retention period (`expire_log_days`) can also be set in seconds by using the command: `binlog_expire_logs_seconds=259200`
log_bin=mysql-binlog  -- For ubuntu, use: `/var/log/mysql/mysql-bin.log` 
server-id=1 -- (only in the case of ubuntu) 
  1. Restart the MySQL server using the command service mysql restart on your instance.
  2. After restart, log in to the MySQL server to check BinLog again:
// for MySql 5.x
mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::"
FROM information_schema.global_variables WHERE variable_name='log_bin';
// for MySql 8.x
mysql> SELECT variable_value as "BINARY LOGGING STATUS (log-bin) ::"
FROM performance_schema.global_variables WHERE variable_name='log_bin';

The value returned is now 1. This indicates that BinLog is active.

Note

The retention period should ideally be at least 72 hours (3 days). This assists Matillion in ensuring that no log files are missing when Historical Data loading is enabled.


Descriptions of MySQL BinLog configuration properties

Property Description
server-id The value for the server-id must be unique for each server and replication client in the MySQL cluster. During MySQL connector set up, the CDC agent assigns a unique server ID to the connector.
log_bin The value of log_bin is the base name of the sequence of BinLog files.
BinLog_format The BinLog_format must be set to ROW or row.
BinLog_row_image The BinLog_row_image must be set to FULL or full.
expire_logs_days This is the number of days for automatic BinLog file removal. The default is 0, which means no automatic removal. Set the value to match the needs of your environment.

Step 2: Grant privileges to the user

The database user specified in the CDC Pipeline must have the following global privileges:

  • Select
  • SUPER or (REPLICATION_CLIENT and REPLICATION_SLAVE)
Note

The suggested grants above provide access to all schemas/tables - please check with your DBA for an appropriate mask.

To set up these privileges:

  • Go to your MySQL server. For example: mysql -u root -p.
  • Grant SELECT privileges to the user using following command:
GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON \*.* TO 'cdcuser'@'%';
Note

Please make sure the user name matches the user used in the CDC pipeline or vice versa.


Step 3: Specify MySQL Connection Settings

In the Configure your MySQL Source page in Matillion Data Loader, specify the following:

Property Description
Pipeline Name The name for your new pipeline. Must be unique.
Server address The server address of your MySQL database.
Port The port number used to access your MySQL database. By default it is 3306.
Database Name The name of your MySQL container database installation.
Username The username used to log in to the specified database.
Secret Provider Choose the provider for your secrets manager that contains your database password-AWS Secrets Manager, Azure Key Vault, or Google Secret Manager.
Secret Name Enter the name of the secret that corresponds to your database password.