Deploying a CDC agent in Azure quick guide
Overview
Use this guide to add a CDC agent in Matillion Data Loader and then deploy that agent in Microsoft Azure. Creating and deploying an agent are required steps to set up a CDC pipeline in Matillion Data Loader.
For best performance, your Azure region should be geographically close to your Matillion Hub account region.
Create a CDC agent in Matillion Data Loader
- Log in to the Matillion Hub.
- The My Accounts page lists any accounts you have already created or joined. At the bottom of this list, click Add new account. Read Create an Account to learn more about this topic.
Each Matillion Hub account can generate its own unique platform key that your CDC agent will use to communicate with Matillion Data Loader. With this in mind, create the CDC agent in the account that matches the platform key you will be using.
- Choose Matillion Data Loader as the service on the Select your service page.
- On the Matillion Data Loader dashboard, scroll to the lower-right of the UI and choose your region.
- Select Agents in the left sidebar and click Add agent.
- Give your agent a sensible Agent name and Description. Click Continue.
- Since this guide is for Azure, select Azure as your cloud provider.
- Choose ARM as the service used to provision and deploy your cloud resources for the CDC agent installation.
- In the Prerequisites for agent setup, note the following values:
- ID_ORGANIZATION: This value is used when deploying the CDC agent in Azure. The value is unique per agent.
- ID_AGENT: Also used when deploying the CDC agent. The value is unique per agent.
- PLATFORM_WEBSOCKET_ENDPOINT: Also used when deploying the CDC agent. The value is unique for the Matillion Data Loader region (US or EU).
- Public/Private key pair: This is a generated value. If you haven't generated a platform secret for your account yet, Matillion Data Loader will prompt you to do so when creating a CDC pipeline. You need to store this value in Azure Key Vault where your CDC agent can access it. For security reasons, this key pair/platform key can only be generated and shown once per account, so make sure to copy and save it for future use.
- You can revisit this page if required.
- The key pair/platform key can only be generated and shown once.
Azure Prerequisite Steps
Step 1: Create a new resource group in Azure
- Log in to the Azure portal.
- Select Resource groups, and select Add.
- Enter the following values:
- Subscription: Select your Azure subscription.
- Resource group: Enter a new resource group name.
- Region: Select an Azure location. It is advised to keep all resources within the same region, if possible.
- Click Review + Create.
- Click Create. It takes a few seconds to create a resource group.
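The same resource group can also be created from the Azure CLI. The following is a minimal sketch, assuming the Azure CLI is installed and you have signed in with az login. The function name create_resource_group is a hypothetical helper chosen for illustration.

```shell
# Hypothetical helper: create a resource group with the Azure CLI.
# Assumes `az login` has already been run for the target subscription.
create_resource_group() {
  # $1 = resource group name, $2 = Azure location (for example, uksouth)
  az group create --name "$1" --location "$2"
}
```

For example, create_resource_group <resource group> <location> creates the group in the chosen location.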
Step 2: Create a new managed identity in Azure
- From the Azure portal, in the search bar, enter Managed Identities. Under Services, select Managed Identities.
- Select Add, and enter values in the following boxes in the Create User Assigned Managed Identity pane:
- Subscription: Choose the subscription to create the user-assigned managed identity under.
- Resource group: Choose a resource group to create the user-assigned managed identity in, or select Create new to create a new resource group.
- Region: Choose a region to deploy the user-assigned managed identity.
- Name: Enter the name for your user-assigned managed identity.
- Click Review + create to review the changes.
- Click Create.
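As an alternative to the portal, the user-assigned managed identity could be created from the Azure CLI. This is a sketch under the assumption that the Azure CLI is installed and logged in. Both function names are hypothetical.

```shell
# Hypothetical helper: create a user-assigned managed identity.
create_agent_identity() {
  # $1 = resource group, $2 = identity name
  az identity create --resource-group "$1" --name "$2"
}

# Hypothetical helper: print the identity's client ID, principal ID, and resource ID.
# The principal ID is the value a role assignment needs.
show_agent_identity_ids() {
  # $1 = resource group, $2 = identity name
  az identity show --resource-group "$1" --name "$2" \
    --query "{clientId:clientId, principalId:principalId, id:id}" --output json
}
```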
Step 3: Create a new storage account and container
- From your Azure portal browse to the Storage accounts service.
- Click Create.
- Select a resource group and region. It is recommended to choose the same region in which you will launch your agent (and key vault).
- Click Review + create and then click Create.
- After a few seconds, your resource will be created. Click Go to resource when complete.
- When the resource has been created, click Containers.
- Click + Container.
- Leave the public access level as Private.
- Give your new container a name and click Create.
- Click Access keys and then click Show Keys.
- Make note of one of your access keys from a Key field. This storage access key will be required while creating CDC pipelines in the Matillion Data Loader UI. Read Blob storage for more information.
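The storage steps above can also be sketched with the Azure CLI. This assumes the CLI is installed and logged in. The function names are hypothetical, and the Standard_LRS SKU is an assumption made for the example rather than a requirement stated in this guide.

```shell
# Hypothetical helper: create the storage account.
create_storage_account() {
  # $1 = resource group, $2 = storage account name, $3 = Azure location
  # Standard_LRS is an assumed SKU; pick the one that suits your environment.
  az storage account create --resource-group "$1" --name "$2" --location "$3" --sku Standard_LRS
}

# Hypothetical helper: print the first access key of the storage account.
get_storage_access_key() {
  # $1 = resource group, $2 = storage account name
  az storage account keys list --resource-group "$1" --account-name "$2" \
    --query "[0].value" --output tsv
}

# Hypothetical helper: create a container with public access turned off.
create_private_container() {
  # $1 = storage account name, $2 = container name, $3 = storage access key
  az storage container create --account-name "$1" --name "$2" --account-key "$3" --public-access off
}
```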
MySQL Configuration (optional)
To enable the MySQL Connector for the agent, you must first upload the MySQL driver to the storage container. The CDC agent pulls in the driver if the appropriate MySQL environment variables are configured. To do this, upload the driver directly via the UI to a suitable folder such as mysql-driver.
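If you prefer the command line to the UI, the driver upload might look like the sketch below. It assumes the Azure CLI is installed and that you have the storage access key noted earlier. The function name is hypothetical, and the mysql-driver folder simply follows the example folder name used above.

```shell
# Hypothetical helper: upload a local MySQL driver file into the mysql-driver folder.
upload_mysql_driver() {
  # $1 = storage account name, $2 = container name,
  # $3 = storage access key, $4 = local path to the driver file
  az storage blob upload --account-name "$1" --container-name "$2" --account-key "$3" \
    --file "$4" --name "mysql-driver/$(basename "$4")"
}
```

The blob name keeps the original file name, so a local file at <path>/<driver file> ends up as mysql-driver/<driver file> in the container.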
Step 4: Create an Azure key vault and store the platform private key in a secret
- Search for Key vaults from the Azure portal.
- Click + Create.
- Select a resource group and region.
- Click Next or click Review + create and then click Create.
- After a brief moment, your key vault will be created and you can click Go to resource.
- Note your Vault URI. You need it when installing the CDC agent in Azure.
Vault Access Policy
The default permission model selection when creating a Key Vault resource is Vault access policy. If this is used, any permissions to retrieve secrets will need to be controlled through Access Policies rather than role assignments.
Azure Role-based Access Control (RBAC)
If you select Azure RBAC as the permissions model, you can use role assignments to configure the level of access required to secrets. For the purposes of this guide, we will be using RBAC as our permissions model.
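A key vault that uses the RBAC permission model can also be created from the Azure CLI. The following is a sketch, assuming the CLI is installed and logged in. The function names are hypothetical.

```shell
# Hypothetical helper: create a key vault that uses Azure RBAC for data access.
create_rbac_key_vault() {
  # $1 = resource group, $2 = vault name, $3 = Azure location
  az keyvault create --resource-group "$1" --name "$2" --location "$3" \
    --enable-rbac-authorization true
}

# Hypothetical helper: print the Vault URI needed when installing the CDC agent.
get_vault_uri() {
  # $1 = vault name
  az keyvault show --name "$1" --query properties.vaultUri --output tsv
}
```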
Once you have created the vault resource, add the following secrets:
- Storage Access Key for your storage account (created earlier).
- A key labelled agent-rsa if you are using the default private key environment variable name. Otherwise, use any appropriate name (the chosen name will later need to be set in the PLATFORM_KEY_NAME environment variable).
You will need access permissions to create secrets within this key vault. To get them, assign your current user either the Key Vault Administrator or the Key Vault Contributor role. The permissions are managed through role assignments as detailed earlier.
The agent-rsa key must be created through the Azure CLI—this is due to a current issue with the Azure Console handling new line characters when a private key is pasted directly in the UI.
The command to create a secret in a key vault is described in Quickstart - Set and retrieve a secret from Azure Key Vault. You will need to copy the key and store it in a plaintext file.
An example of this is listed below:
az keyvault secret set --vault-name "<vault name>" --name "agent-rsa" --file <file_name>
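Because lost line breaks are the reason for using the CLI here, it may be worth checking the plaintext file before uploading it. The sketch below is an illustration only. It assumes the private key is stored in PEM format (a header line that begins with -----BEGIN), which this guide does not state explicitly, and check_private_key_file is a hypothetical function name.

```shell
# Hypothetical helper: sanity-check the private key file before uploading it.
# Returns 0 if the file looks like a multi-line PEM key, 1 otherwise.
check_private_key_file() {
  # $1 = path to the plaintext file that holds the private key
  if [ ! -f "$1" ]; then
    echo "file not found: $1" >&2
    return 1
  fi
  # Assumption: a PEM key has a "-----BEGIN ... PRIVATE KEY-----" header on line 1.
  if ! head -n 1 "$1" | grep -q -e '-----BEGIN .*PRIVATE KEY-----'; then
    echo "no PEM header on the first line of $1" >&2
    return 1
  fi
  # A file with only one line suggests the line breaks were lost.
  if [ "$(grep -c '' "$1")" -le 1 ]; then
    echo "$1 is a single line; line breaks were probably lost" >&2
    return 1
  fi
  return 0
}
```

If the check passes, the file can be passed to the az keyvault secret set command shown above.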
Step 5: Adding role assignments
Once the above steps are completed, add role assignments to both the storage account and key vault. This is done by navigating to the respective resource.
- Select Access control (IAM) from the navigation menu.
- Click Add followed by Add role assignment.
- Select the appropriate Role:
- For Key Vault, select Key Vault Secrets User.
- For the Storage Account, select Storage Blob Data Reader.
- Select Next and then choose the managed identity created earlier.
- In the Members area, select the Managed identities radio button, click Select members, and choose the identity from the list.
- Select Next, review the configuration, and then select Review + assign.
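The role assignments can also be scripted. This is a minimal sketch, assuming the Azure CLI is installed and your user is allowed to create role assignments on both resources. The function names are hypothetical.

```shell
# Hypothetical helper: print the principal (object) ID of the managed identity.
get_identity_principal_id() {
  # $1 = resource group, $2 = identity name
  az identity show --resource-group "$1" --name "$2" --query principalId --output tsv
}

# Hypothetical helper: assign a built-in role to the identity on one resource.
assign_role() {
  # $1 = principal ID of the managed identity
  # $2 = role name, for example "Key Vault Secrets User"
  # $3 = resource ID of the key vault or storage account (the scope)
  az role assignment create --assignee-object-id "$1" \
    --assignee-principal-type ServicePrincipal --role "$2" --scope "$3"
}
```

The scope is the full resource ID. It can be read with az keyvault show --name <vault name> --query id --output tsv for the key vault, and with az storage account show --resource-group <resource group> --name <storage account> --query id --output tsv for the storage account.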
Step 5.5: Create a Log Analytics workspace (Optional)
- Navigate to the Log Analytics workspaces resource.
- Click Create.
- Select a resource group and region. It's recommended to choose the same region in which you will launch your agent (and key vault).
- Give your new workspace a name and click Review + create, then click Create.
- After a few seconds your resource will be created. Click Go to resource when complete.
- While still in your new Log Analytics workspace resource, click Agents.
- Copy the Workspace ID field and store this value for later.
- Copy the Workspace Key field and store this value for later.
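The optional workspace can be created, and its ID and key read back, from the Azure CLI as well. The sketch below assumes the CLI is installed and logged in. The function names are hypothetical. The customerId property is the value the portal shows as Workspace ID, and the primary shared key is one of the workspace keys.

```shell
# Hypothetical helper: create the Log Analytics workspace.
create_log_workspace() {
  # $1 = resource group, $2 = workspace name, $3 = Azure location
  az monitor log-analytics workspace create --resource-group "$1" \
    --workspace-name "$2" --location "$3"
}

# Hypothetical helper: print the Workspace ID.
get_log_workspace_id() {
  # $1 = resource group, $2 = workspace name
  az monitor log-analytics workspace show --resource-group "$1" \
    --workspace-name "$2" --query customerId --output tsv
}

# Hypothetical helper: print the primary workspace key.
get_log_workspace_key() {
  # $1 = resource group, $2 = workspace name
  az monitor log-analytics workspace get-shared-keys --resource-group "$1" \
    --workspace-name "$2" --query primarySharedKey --output tsv
}
```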
Step 6: Creating an agent container
With the permissions set and the resources created, you can start to put together the agent container instance template.
This step will need to be completed within the Azure CLI:
You can use the template.json to create a new container instance running the cdc-agent. Any strings enclosed in <> will need to be replaced with actual values. Other properties are available; however, the template shows the bare minimum to get an agent up and running.
For more information on using ARM templates, please read Azure - ARM Template.
There are a few things to note:
- Any MYSQL_ environment variables do not need to be provided if you are not running the MySQL connector.
- The managed identity details can be found on the identity created earlier within this guide.
- If you have completed the optional step, you can pass the workspace details to the template; if not, you can omit the Log Analytics section entirely.
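Before creating the container, it can help to confirm that no <> placeholders are left in the edited template. The following is a small sketch that uses only grep. The function name is hypothetical, and the check assumes that angle brackets appear in the template only as placeholders.

```shell
# Hypothetical helper: list lines that still contain a <...> placeholder.
# Prints "line number:line text" for each match.
# Returns 0 if placeholders remain and 1 if none were found (grep's convention).
find_unreplaced_placeholders() {
  # $1 = path to the edited template file
  grep -n '<[^<>]*>' "$1"
}
```

An empty result means every placeholder has been replaced.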
Once the template has been updated and saved (as a .json file), you can create the new container using the following command:
az container create --resource-group <resource group> --name <container name> --file <file_name>.json
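Once the command returns, the state and logs of the new container instance can be inspected from the CLI. This is a sketch, assuming the same resource group and container name used in the command above. The function names are hypothetical.

```shell
# Hypothetical helper: print the current state of the container group.
get_agent_container_state() {
  # $1 = resource group, $2 = container name
  az container show --resource-group "$1" --name "$2" \
    --query instanceView.state --output tsv
}

# Hypothetical helper: print the container's log output.
show_agent_container_logs() {
  # $1 = resource group, $2 = container name
  az container logs --resource-group "$1" --name "$2"
}
```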
Deploy your CDC agent in Azure
There are many deployment methods for a Matillion Data Loader CDC agent in an Azure environment. The following steps use the process detailed in Azure - ARM Template.
- Download the Azure ARM Template named template.json.
- Log in to the Azure portal.
- Search for Deploy a custom template.
- Select Build your own template in the editor.
- Select Load file and choose the template file downloaded in step 1.
- The template file will now be loaded in the editor. Click Save.
- In Custom Deployment, complete the template metadata parameters:
- Subscription: Select the desired subscription from the drop-down menu. This field is likely to default to your preferred subscription. Please note, all resources in an Azure subscription will be billed together.
- Resource Group: Assign a resource group to the selected subscription, or create a new one for your instance. A resource group is a collection of resources that share the same permissions and policies.
- Region: Where to deploy your CDC agent. This should be the same region (either US or EU) that you chose when you created the CDC agent in Matillion Data Loader.
- Agent Id: This is the ID_AGENT value you copied from the Prerequisites for agent setup dialog (step 9 of Create a CDC Agent in Matillion Data Loader).
- Azure Client Id: Client Id of the Azure Active Directory service principal used for authentication.
- Azure Client Secret: Client secret value of the Azure Active Directory service principal used for authentication.
- Azure Key Vault URL: The URL of the key vault where your private key is stored.
- Azure Tenant Id: Tenant Id of your Azure Active Directory.
- Container DNS Name: This is the DNS prefix the container will be available at.
- Container Group Name: Provided by the uploaded template.
- Container Name: The name of the Azure Blob Storage container to be used as a target (you can enter the Container Name that you created earlier). This can be entered into the template file. It is advised to create a new container specifically for this purpose.
- Image URL: The repository of the Matillion Data Loader CDC Agent. The template should autofill this value.
- Log Analytics Workspace Id: Log Analytics Workspace Id is to facilitate log collection and retention.
- Log Analytics Workspace Key: Log Analytics Workspace Key is to provide the full credentials for access.
- Organization Id: This is the ID_ORGANIZATION value you copied from the Prerequisites for agent setup dialog (step 9 of Create a CDC Agent in Matillion Data Loader).
- Platform Key Name: The name of the Azure key vault secret in which your generated private key is stored.
- Platform Websocket Endpoint: This is the PLATFORM_WEBSOCKET_ENDPOINT value you copied from the Prerequisites for agent setup dialog (step 9 of Create a CDC Agent in Matillion Data Loader).
- Click Next and follow the instructions on screen.
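The portal flow above could also be run as a resource group deployment from the Azure CLI. The sketch below assumes the CLI is installed and logged in, and that you have written a parameters file whose names match the parameters declared in the downloaded template.json. The function name and the parameters file are hypothetical and are not provided by this guide.

```shell
# Hypothetical helper: deploy the downloaded ARM template into a resource group.
deploy_agent_template() {
  # $1 = resource group
  # $2 = path to the downloaded template.json
  # $3 = path to a parameters file you have written for that template
  az deployment group create --resource-group "$1" --template-file "$2" --parameters "@$3"
}
```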
In Matillion Data Loader, your created CDC agent's status should display as Connected and offer the Add Pipeline button.