Launching Matillion ETL for Delta Lake on AWS
This is a guide to launching a Matillion ETL instance using Delta Lake on AWS.
- This document is part of the Matillion ETL Instance Creation process.
- Matillion ETL uses Databricks Partner Connect to simplify the process of connecting to an existing SQL endpoint or cluster in your Databricks workspace. For full details of using Partner Connect, read Connect to Matillion by using Partner Connect in the Databricks documentation.
To launch and configure Matillion ETL for Delta Lake on AWS using Partner Connect, use the following steps.
Click Continue in AWS to open the Amazon EC2 Console. The console lists an Amazon Machine Image (AMI) that is pre-configured with the details required to launch the instance. Select the AMI from the list, then click Launch instance from AMI at the top-right of the page.
If choosing from a list of AMIs, use only AMIs with "billing" in the instance name and not "byol". A "billing" filter has been pre-set on the list to assist you here. Also note that images aren't necessarily listed in order of recency, so take care to select the version you want.
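The selection rule above (keep "billing" images, skip "byol", and don't rely on list order for recency) can be sketched in Python. This is a minimal illustration, not part of the Matillion or AWS tooling: it assumes each image is a dictionary with the `Name`, `ImageId`, and `CreationDate` keys that the EC2 `describe_images` call returns, and the image names and IDs in the sample data are hypothetical.

```python
from datetime import datetime


def pick_latest_billing_ami(images):
    """Return the newest image whose name contains "billing" but not "byol".

    Returns None when no image matches.
    """
    candidates = [
        img for img in images
        if "billing" in img["Name"].lower() and "byol" not in img["Name"].lower()
    ]
    if not candidates:
        return None
    # EC2 reports CreationDate as a string such as "2023-01-15T10:00:00.000Z".
    # Sorting on the parsed date avoids trusting the order of the list.
    return max(
        candidates,
        key=lambda img: datetime.strptime(img["CreationDate"], "%Y-%m-%dT%H:%M:%S.%fZ"),
    )


# Hypothetical sample data, deliberately not in date order.
sample_images = [
    {"ImageId": "ami-aaa", "Name": "example-billing-old", "CreationDate": "2022-03-01T09:00:00.000Z"},
    {"ImageId": "ami-bbb", "Name": "example-billing-new", "CreationDate": "2023-01-15T10:00:00.000Z"},
    {"ImageId": "ami-ccc", "Name": "example-byol-new", "CreationDate": "2023-06-01T10:00:00.000Z"},
]

chosen = pick_latest_billing_ami(sample_images)
print(chosen["ImageId"])
```

Parsing the creation date rather than taking the first or last entry mirrors the caution above about list order.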
- The Launch an instance page opens, pre-populated with default settings for the instance. The only item you must select here is a valid Key pair; you can accept the defaults for everything else. If you wish to change any default settings before launching, take the following points into account.
- Number of instances: Leave as default (1), unless you want to launch multiple instances.
- Name and Tags: Click Add additional tags to add any instance tags you require.
- Application and OS Images (Amazon Machine Image): This has been pre-selected, so do not change anything here.
- Instance Type: You can leave this as default, or click the drop-down arrow to choose another supported instance type from the list.
- Key pair (login): Select a valid key pair from the list.
- Network settings: VPC: The region in which this VPC is located should ideally match the region of your Databricks account (European Databricks accounts should be paired with an EU-based AWS region, for example).
- Network settings: Auto-assign Public IP: This depends on your setup. By default, a new VPC won't have VPN connections or NAT Gateways available, so this normally needs to be set to Enable, both for Matillion ETL to connect to the internet and for you to access Matillion ETL.
- Configure storage: Accept the default root volume size.
- Advanced details: Request spot instances: Do not select this option.
- Advanced details: Shutdown behavior: Select Stop.
- Advanced details: Termination protection: Select Enable.
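For readers who script their launches, the settings listed above map roughly onto the request parameters of the EC2 `run_instances` API. The following is a hedged sketch only: the helper name `build_launch_params` is hypothetical, the console flow described in this guide does not require it, and all argument values in the usage lines are placeholders. It only builds the parameter dictionary and makes no AWS calls.

```python
def build_launch_params(image_id, key_name, subnet_id, instance_type, tags=None):
    """Build run_instances-style parameters that mirror the settings above.

    Hypothetical helper: it only assembles a dictionary and calls nothing.
    """
    params = {
        "ImageId": image_id,            # the pre-selected AMI; do not change
        "InstanceType": instance_type,  # a supported instance type
        "KeyName": key_name,            # the one mandatory choice: a valid key pair
        "MinCount": 1,                  # number of instances: default of 1
        "MaxCount": 1,
        # Shutdown behavior: Stop
        "InstanceInitiatedShutdownBehavior": "stop",
        # Termination protection: Enable
        "DisableApiTermination": True,
        # Auto-assign Public IP: Enable (normally needed in a new VPC)
        "NetworkInterfaces": [
            {"DeviceIndex": 0, "SubnetId": subnet_id, "AssociatePublicIpAddress": True}
        ],
        # No spot request and no storage override are added, so the launch is
        # on-demand and keeps the AMI's default root volume.
    }
    if tags:
        params["TagSpecifications"] = [
            {
                "ResourceType": "instance",
                "Tags": [{"Key": k, "Value": v} for k, v in tags.items()],
            }
        ]
    return params


# Placeholder values for illustration only.
params = build_launch_params(
    image_id="ami-EXAMPLE",
    key_name="my-key-pair",
    subnet_id="subnet-EXAMPLE",
    instance_type="EXAMPLE-TYPE",
    tags={"Name": "matillion-etl"},
)
print(sorted(params))
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `ec2_client.run_instances(**params)`; treat that as an assumption about your environment rather than a documented Matillion procedure.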
4. Once you have selected a Key pair (the only mandatory item on this page), click Launch instance.
- It may take a few minutes until your launched instances are in a running state.
- Click View Instances to monitor the instance status. Once your instances are in a running state, you can connect to them from the Instances page.
- Log in to Matillion ETL with the username ec2-user and, as the password, the instance ID i-xxxxxxxx (for example i-88ed92c6).
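If you monitor instance status from a script rather than the Instances page, the "running" check can be sketched as below. This is an illustrative assumption, not part of the documented procedure: the function expects a dictionary shaped like the response of the EC2 `describe_instances` call, and the second instance ID in the sample data is made up.

```python
def running_instance_ids(response):
    """Return the IDs of instances whose state is "running".

    `response` is assumed to follow the describe_instances response shape:
    {"Reservations": [{"Instances": [{"InstanceId": ..., "State": {"Name": ...}}]}]}
    """
    ids = []
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            if instance["State"]["Name"] == "running":
                ids.append(instance["InstanceId"])
    return ids


# Sample response for illustration; i-0example is a made-up ID.
sample_response = {
    "Reservations": [
        {
            "Instances": [
                {"InstanceId": "i-88ed92c6", "State": {"Name": "running"}},
                {"InstanceId": "i-0example", "State": {"Name": "pending"}},
            ]
        }
    ]
}

ready = running_instance_ids(sample_response)
print(ready)
```

Each ID returned this way is the instance ID needed at the login step, together with the ec2-user username.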