MDL Dashboard Overview
Overview
Matillion Data Loader is an application for the bulk import or export of data. It offers two types of pipeline: Batch pipelines and Change Data Capture (CDC) pipelines.
- Batch: Batch Load Replication is a SaaS incremental data loading service that extracts, transforms, and loads data from chosen sources at user-specified time intervals to cloud data warehouse or platform destinations, such as Snowflake, Amazon Redshift, Delta Lake on Databricks, and Google BigQuery.
- Change Data Capture: Identifies and captures changes made to a data source in real time, and ensures those changes are loaded to a destination storage location, such as Amazon S3, Azure Blob, or Google Cloud Storage.
Data flow in a pipeline
A pipeline moves your data from a source system to a destination. Pipelines replicate raw data from your source application or database to a destination database or data warehouse. You can customize the data replication process to suit the type of source and the type of destination.
The Matillion Data Loader UI provides an easy process to set up a data pipeline.
Creating a pipeline involves the following components:
- CDC Agent (CDC pipelines only): The Matillion CDC agent manages the tasks in your cloud provider that orchestrate the CDC process. For more information, read CDC agent UI.
- Source configuration: A source can be a database, a SaaS-based application (an API endpoint), or a file storage location that has the data that you want to analyze. Matillion CDC integrates with a variety of sources.
- Table selection: Select the table you want to extract data from. Its data is then loaded to the destination.
- Destination configuration: A destination is a data warehouse or database into which the data extracted from a source is loaded. For more information, read Destinations. To understand the setup requirements for data warehouses, read Technical requirements.
- Settings: Apply these to your connection details and the pipeline you're creating.
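The relationship between these components can be pictured as a small data structure. The following is only an illustrative sketch, not a Matillion API: pipelines are configured through the Matillion Data Loader UI, and every name here (`PipelineSpec`, `PipelineKind`, `problems`, the placeholder source and agent names) is hypothetical. It captures one rule stated above, namely that the CDC agent applies only to CDC pipelines.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class PipelineKind(Enum):
    BATCH = "batch"
    CDC = "cdc"


@dataclass
class PipelineSpec:
    """Hypothetical model of the pipeline components; not a Matillion API."""
    kind: PipelineKind
    source: str                      # placeholder name for a configured source
    tables: List[str]                # tables selected for extraction
    destination: str                 # placeholder name for a configured destination
    agent: Optional[str] = None      # CDC agent, used by CDC pipelines only
    settings: Dict[str, str] = field(default_factory=dict)

    def problems(self) -> List[str]:
        """Return human-readable issues with this spec (empty if none)."""
        issues = []
        if self.kind is PipelineKind.CDC and not self.agent:
            issues.append("a CDC pipeline needs a CDC agent")
        if self.kind is PipelineKind.BATCH and self.agent:
            issues.append("a batch pipeline does not use a CDC agent")
        if not self.tables:
            issues.append("select at least one table")
        return issues


# Assumed example values; "example-source-db" is a made-up source name.
batch = PipelineSpec(PipelineKind.BATCH, "example-source-db", ["orders"], "Snowflake")
cdc = PipelineSpec(PipelineKind.CDC, "example-source-db", ["orders"], "Amazon S3")

print(batch.problems())
print(cdc.problems())
```

In this sketch the batch spec reports no problems, while the CDC spec is flagged because no agent was supplied. The real product may enforce further requirements that are not modelled here.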
Before you begin
- An active Matillion Hub account. If you don't have an existing Matillion Hub account, please register for a user account to begin creating pipelines, using the Batch or CDC data loading services.
- Access to the source system (integration) where your data are stored.
- A destination (database or data warehouse) to which the data must be loaded.
Next steps
- For a general understanding of the Matillion Data Loader sign-up process, read Signing up for Matillion Data Loader.
- For a detailed understanding of the Matillion Data Loader user interface, read Matillion Data Loader Pipeline UI.
- For a detailed understanding of how the CDC Agent setup UI works, read Agent Setup UI (CDC).