Edit a CDC pipeline
Matillion Data Loader lets you modify an existing pipeline to meet your requirements, rather than deleting the pipeline and starting over.
Editing a pipeline allows you to:
- Edit connection parameters.
- Change schemas.
- Add or remove a table.
- Edit the chosen destination configurations.
- Disable a snapshot.
If you reuse the same prefix, you must clear all files from the prefix folder in your chosen cloud storage location. If the pipeline uses a snapshot, editing the pipeline requires the snapshot to be taken again.
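As a rough illustration of what clearing a prefix folder involves, the sketch below picks out only the object keys that sit under a given prefix. It assumes you already have a list of object keys from your storage provider's listing call. The function name keys_to_clear and the sample keys are hypothetical, and the deletion itself is left to your provider's own tooling.

```python
def keys_to_clear(all_keys, prefix):
    """Return the keys that live under `prefix`, treating it as a folder."""
    cleaned = prefix.strip("/")
    if not cleaned:
        # An empty prefix would select every object in the location.
        raise ValueError("refusing to select every object: prefix is empty")
    # Match on "prefix/" so that "cdc/sales" does not also match
    # a sibling folder such as "cdc/sales_archive".
    folder = cleaned + "/"
    return [key for key in all_keys if key.startswith(folder)]


# Hypothetical listing of a storage location.
sample_keys = [
    "cdc/sales/part-0001",
    "cdc/sales/part-0002",
    "cdc/sales_archive/part-0001",
    "other/part-0001",
]

for key in keys_to_clear(sample_keys, "cdc/sales"):
    print(key)
```

Matching on the prefix plus a trailing slash is a deliberate choice here: a plain string-prefix match would also sweep up neighbouring folders whose names merely begin with the same characters.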
This article describes the steps required to accomplish this.
From either the pipeline dashboard, or a given pipeline’s details page, click ... and select Edit pipeline from the dialog menu.
Confirm that you understand the warning to 'clear out the files from the cloud storage location you are intended to use', make sure that you do so, and click I Understand.
Edit source connection details
You will be redirected to the source connection page, where you can make edits to your database connection details such as server address, username, and the secret name that you have defined. Once completed, click Test and Continue.
You cannot switch to a different database source. However, you can change the connection details of the existing database.
On the schema selection page, select the schemas from which you want to load tables. Use the arrow buttons to move schemas to the Selected schemas listbox, and reorder schemas with click-and-drag. You can select multiple schemas using the SHIFT key. Click Continue with X schema to move forward.
You will then be redirected to choose tables, where you can select any table you wish to include in the pipeline. Use the arrow buttons to move tables and schemas to the Tables to extract and load listbox, and reorder tables with click-and-drag. You can select multiple tables using the SHIFT key. Click Continue with X tables to move forward.
You must ensure that the selected tables are enabled for CDC in the source. How to do this depends on the data source being used, and is described in the documentation for each source.
Edit destination connection details
On the connect to destination page, edit the connection details of the destination, then click Test and continue.
Edit pipeline settings
On the pipeline settings page, you can edit the pipeline name and enable or disable Snapshot Database for the source. Then click Continue.
The pipeline summary page lets you review the selections you have made in each of the previous stages. You can return to any earlier stage to make adjustments if required.
The summary is divided into the following sections:
- Agent Details
- Source Details
- Selected Tables
- Destination Details
- Pipeline Settings
When you're satisfied with your selections, click Update pipeline to complete the process.
This page is also displayed at the final stage of the pipeline creation process. See Matillion Data Loader Pipeline UI for more information.
A confirmation dialog box appears. If you are satisfied with the changes and have made sure the intended cloud storage location is empty, click I understand, update.
Read Pipeline Dashboard Overview for more details about using this interface.
Disable and clear the shared job
If the pipeline has a shared job currently in use in Matillion ETL, its schedule must be disabled. Read Manage Schedules for details about how to do this.
Additionally, if the Change Log Transformation shared job is in use, you may want to clear data to avoid duplication. To do this, run the Drop CDC Tables shared job with the property Actually Drop The Tables set to Y. This will drop the CDC tables currently in the target database.
We advise that you run the Drop CDC Tables shared job with Actually Drop The Tables set to N first, then check the task history to see which tables would be dropped. If you are happy that this is correct, you can run the job again with the property set to Y to actually drop the tables.
You don't need to clear down the tables when using the Copy Table transformation, as it merges new events into the target table using the primary key, so reloaded events update the existing records instead of creating duplicates.
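The difference between the two transformations can be sketched with a simplified model. This is not Matillion's implementation: the event shape (an id and a value), the function names append_events and merge_events, and the sample data are all assumptions made for illustration. The sketch loads the same snapshot twice, as happens when a snapshot is retaken after a pipeline edit.

```python
def append_events(change_log, events):
    """Model of an append-only change log: every event becomes a new row."""
    return change_log + list(events)


def merge_events(target, events, key="id"):
    """Model of a merge on primary key: one row per key, latest event wins."""
    merged = dict(target)
    for event in events:
        merged[event[key]] = event
    return merged


# Hypothetical snapshot of a two-row source table.
snapshot = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]

# Load the same snapshot twice into each model.
log = append_events(append_events([], snapshot), snapshot)
table = merge_events(merge_events({}, snapshot), snapshot)

print(len(log))    # 4: the second load duplicated every record
print(len(table))  # 2: the second load only overwrote existing records
```

In this model the append-only log needs clearing before a reload to avoid duplicates, while the keyed merge gives the same result however many times the snapshot is replayed.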