-
DarkLight
Configuring PostgreSQL Database
-
DarkLight
Overview
The CDC Agent's PostgreSQL connector is able to monitor and consume changes as they're occurring within your PostgreSQL database. To achieve this, the connector uses PostgreSQLs logical decoding mechanism which is tracking and extracting all persistent changes to a database's tables. This process will therefore need to claim a replication slot to enable CDC. To ensure you are setup for CDC, please continue reading this setup guide.
These steps are based on advice that can be found in the PostgreSQL documentation, particularly concerning replication and the write-ahead log. We recommend reading that documentation thoroughly.
The PostgreSQL connector currently supports PostgreSQL versions 10 and above. These versions contain the pgoutput
logical decoding plugin natively so no further plugin installations are required.
WAL considerations for PostgreSQL
The WAL (Write-Ahead Log) doesn't take up a constant amount of disk space. It can grow as changes come in, and shrink over time, or as checkpoints are made. The PostgreSQL database will automatically attempt to reclaim disk space in blocks when that data is no longer needed.
In some cases, the WAL may grow larger than you expected. While the WAL is shared by all databases, tables and schemas, your CDC agent may only track a select few. This means the WAL can quickly fill with untracked changes, particularly if your untracked resources have high traffic while your tracked resources have low traffic. Configuring the heartbeat for your CDC agent so your agent makes regular updates to a table in the database can help avoid this issue.
What you will need
To complete the steps detailed in this guide you will need:
- An account on PostgreSQL. This is most commonly a super user account, but it does not have to be.
- An understanding of which database and schemas you will be targeting for CDC.
- A list of tables which you want to monitor and capture changes for.
Setting up PostgreSQL for CDC
The initial requirement is to ensure that the wal_level
, max_wal_senders
, and max_replication_slots
are set appropriately to allow the connector to establish a connection and claim a replication slot. There are two different approaches to configuring these values, as follows:
Option 1: Update the postgresql.conf
file
This configuration file can be updated to set all three properties. For logical decoding, the wal_level
must be set to logical
and there must be one available replication slot which can be claimed by the connector. An example configuration is included below:
# REPLICATION
wal_level = logical -- instructs the server to use logical decoding with the write-ahead log
max_wal_senders = 1 -- use a maximum of 1 separate process for processing WAL changes| default: 10
max_replication_slots = 1 -- allow a maximum of 1 replication slot to be created for streaming WAL changes| default: 10
You can run the SHOW config_file
snippet via PostgreSQL to view the location of this file.
Once the above values are finalised and saved, a restart is required to apply any changes made.
Option 2: Update the system properties via SQL
Alternatively, you can use SQL to update these properties. Once updated, a restart is still required to apply the changes. An example SQL script is provided below:
SHOW wal_level -- can be used with max_wal_senders and max_replication_slots
ALTER SYSTEM SET wal_level = logical -- can be used with max_wal_senders and max_replication_slots
Once applied, the database server will be ready for CDC.
Configuration instructions vary slightly on RDS instances. Please follow the steps below to configure your RDS PostgreSQL server:
The RDS instance parameter rds.logical_replication
needs to be set to 1
. Applying this change will require a restart. You may decide to apply the changes immediately or, if this is not possible, wait for a planned maintenance window. Once applied, you can query SHOW wal_level
to ensure the property is set to logical
.
For PostgreSQLs write-ahead logs, the default setting for wal_keep_segments
is 0 - this setting configures the size of WAL files held for standby servers. If your retention value is too low and an outage is experienced, change data capture may not be able to recover. WAL segments may already have moved beyond stored position. This is typically indicated with a startup error as follows: ERROR: requested WAL segment 000000010000000000000001 has already been removed
.
If this were to occur, it will be necessary to for the CDC agent to take a new snapshot of the database.
Replication slots are guaranteed to retain all WAL segments. It is important to closely monitor replication slots to avoid excessive disk consumption and other conditions that may occur, such as catalog bloat, if a replication slot stays unused for too long. Reviewing the official documentation for the mechanics and configuration of write-ahead logs is recommended.
Setting up permissions
Once the server configuration has been completed you will need to setup an appropriate role within the database for the CDC agent to use. Roles in PostgreSQL are synonymous with users. Read the PostgreSQL documentation for details.
It is not recommended to provide elevated privileges or superuser permissions for this role and the master account should not be used for CDC purposes.
To setup a role with the correct permissions, please follow the steps below:
A superuser will need to complete the steps below.
- Create a new role with
REPLICATION
andLOGIN
permissions.
CREATE ROLE cdcrole WITH REPLICATION LOGIN PASSWORD 'xxx' -- a suitable password should be provided
For RDS PostgreSQL, use the following commands:
CREATE ROLE cdcrole WITH LOGIN PASSWORD 'XXXX';
GRANT rds_replication TO cdcrole;
- The new role will initially need to be granted
USAGE
permissions on any schema you are intending to use within your CDC pipeline.
GRANT USAGE ON SCHEMA public TO cdcrole;
- This role will need
SELECT
permissions on any tables you wish to perform CDC against. This is required for thesnapshot
stage of the process. If you don't need perform a snapshot,SELECT
permissions are not required.
-- Grant the new user with permissions to run SELECT on the CDC marked tables
GRANT SELECT ON public.phone TO cdcrole;
GRANT SELECT ON public.customer TO cdcrole;
-- You can also grant ALL TABLES to the user if required
GRANT SELECT ON ALL TABLES IN SCHEMA public TO cdcrole;
- A new publication will need to be created with the name
matillion_cdc_publication
. All CDC marked tables should be added to this publication. This publication must exist for theSnapshotting
stage of the process. An example is included below:
CREATE PUBLICATION matillion_cdc_publication FOR TABLE public.phone, public.customer
-- The default publish setting is appropriate in this case
- Finally, you will need to allow replication with the agent host and the new role created earlier. To do this you need to add a new entry in the PosrgreSQL host authentication file,
pg_hba.conf
. An example configuration is included below:
############ REPLICATION ##############
local replication cdcrole trust -- allow replication for `postgres` locally
host replication cdcrole 127.0.0.1/32 trust -- receive replication changes using `IPV4`
host replication cdcrole ::1/128 trust -- receive replication changes using `IPV6`
A configuration reload is required at this point - this can be achieved using SELECT pg_reload_conf()
or pg_ctl reload
via the CLI.
Read the PostgreSQL documentation for more information on network masks.
RDS replication permissions require a slightly varied process:
Your RDS instance will have automatically created an rds_replication
role. You will not be able to assign the REPLICATION
option directly to your new user - instead, you can grant the rds_replication
role to your user. You must have superuser
access to perform this action.
CREATE ROLE cdcrole WITH LOGIN PASSWORD 'xxx'
-- Assign the rds_replication role to the new user
GRANT rds_replication TO cdcrole
PostgreSQL Pipeline Check for Replication Slot
If your pipeline did not go into the Snapshotting
status and instead remained in the Not Running
status, this may be caused by the Replication Slot creation not completing as expected.
One of the most apparent reasons for this occurring is an exclusive lock held on one of the tables included in the pipeline selection, this then causes the Agent to wait for the Replication Slot to be fully generated.
This can be identified by running the following commands. Firstly, to view active SQL queries within the PostgreSQL instance, you execute:
SELECT *
FROM pg_stat_activity
This query looks at pg_stat_activity
which is a system view that allows you to identify active SQL queries in the PostgreSQL instance
If the above command shows that CREATE_REPLICATION_SLOT
command is still running, the pg_locks
view should be examined using the related PID:
SELECT *
FROM pg_locks
WHERE pid = ?
If any locks have been identified as present for the given PID, use the associated transaction ID from the above to interrogate the pg_locks
view once more:
SELECT *
FROM pg_locks
WHERE transactionid = ?
These queries use the pg_locks
view, which shows active processes and the locks they hold within the PostgreSQL instance
When executing the above command for a given transaction ID, if there are multiple locks returned by separate processes, It has likely identified a situation where the replication slot generation is blocked by another process. You can look back at the pg_stat_activity
view using the PID returned from this command to identify why the execution has not completed:
SELECT *
FROM pg_stat_activity
WHERE pid = ?
From the output of this command, it should be apparent if other activities are holding locks on the tables included in the pipeline.
You're required to wait for the exclusive lock process to complete to get pipeline into "Snapshotting".
Next steps
With the above configuration items completed, you should now be in a position to connect the CDC agent to your PostgreSQL database. Please consult the Matillion Data Loader Pipeline UI guide for next steps.