Technical Requirements

Overview

This article outlines the current technical requirements and limitations for Matillion Data Loader.

Below are the technical requirements for Amazon Redshift, Snowflake, Google BigQuery, Amazon S3, and Azure Blob.

Amazon Redshift

  • Matillion Data Loader does not yet support SSH tunneling or PrivateLink, so you must either have a publicly accessible Redshift cluster or set up an SSH host that is publicly accessible and can forward traffic to the VPC Redshift is running inside.
  • We recommend using a separate Amazon Redshift cluster (a single dc2.large node should be sufficient) for testing. Matillion will reimburse reasonable charges incurred on submission of an AWS bill detailing the cluster used. Please don’t test on 50 8XL nodes!
  • An AWS access key and secret key associated with an IAM identity that can read and write to S3. S3 is used as a staging area and, although no objects will be left behind permanently, we need to read and write S3 objects temporarily during processing.
  • Amazon Redshift Username & Password for the Redshift instance used during testing.
  • Authentication for any third-party data sources. When configuring a data source, you will be prompted to grant Matillion access to it, and you are free to choose which account you use during that authorisation. This could be:
    • Usernames/Passwords for JDBC-accessible databases
    • OAuth for most others.
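Because the staging credentials above must be able to read and write S3, it can help to see what a suitably scoped IAM policy looks like. The sketch below builds a minimal policy document; the exact set of actions Matillion requires is an assumption here (check your AWS and Matillion documentation), and the bucket name is a placeholder.

```python
import json

def staging_policy(bucket: str) -> str:
    """Build a minimal IAM policy (illustrative, not Matillion's official
    policy) granting read/write access to an S3 staging bucket."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {   # list the bucket and discover its region
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {   # read, write, and clean up the temporary staging objects
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }
    return json.dumps(policy, indent=2)

print(staging_policy("mdl-staging-example"))  # placeholder bucket name
```

Note the two statements: bucket-level actions (`s3:ListBucket`) apply to the bucket ARN, while object-level actions apply to `bucket/*`.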

Snowflake (AWS or Azure)

  • Matillion Data Loader does not yet support SSH tunneling or PrivateLink, so you must either have a publicly accessible Snowflake account or set up a publicly accessible SSH host that can forward traffic to the VPC where the PrivateLink to Snowflake is set up. We recommend using a publicly accessible Snowflake account if one is available.
  • Snowflake Username & Password for the Snowflake instance used during testing.
  • Authentication for any third-party data sources. When configuring a data source, you will be prompted to grant Matillion access to it, and you are free to choose which account you use during that authorisation. This could be:
    • Usernames/Passwords for JDBC-accessible databases
    • OAuth for most others.
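Since Data Loader needs to reach your Snowflake account over the public internet, a quick pre-flight check is to confirm the account's public hostname resolves. The helper below is a hypothetical sketch; the `<account>.snowflakecomputing.com` hostname pattern is the standard one for publicly accessible accounts, and the account identifier shown is a placeholder.

```python
def snowflake_endpoint(account: str) -> str:
    """Return the public hostname for a Snowflake account identifier
    (illustrative helper; account identifiers vary by cloud/region)."""
    return f"{account}.snowflakecomputing.com"

host = snowflake_endpoint("xy12345.eu-west-1")  # placeholder identifier
print(host)

# To verify reachability you could resolve the host on port 443, e.g.:
#   import socket
#   socket.getaddrinfo(host, 443)
# A resolution failure suggests the account is not publicly addressable.
```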

Google BigQuery

  • Matillion Data Loader requires a Google Service Account which is configured to be able to use Google BigQuery and Google Cloud Storage. Google Cloud Storage (GCS) is used as a staging area and, although no objects will be left behind permanently, we need to read/write to GCS objects temporarily during processing.
  • Authentication for any third-party data sources. When configuring a data source, you will be prompted to grant Matillion access to it, and you are free to choose which account you use during that authorisation. This could be:
    • Usernames/Passwords for JDBC-accessible databases
    • OAuth for most others.
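A common stumbling block with the Google Service Account requirement above is a malformed or incomplete key file. The sketch below is a hypothetical pre-flight check that verifies a parsed service-account key contains the fields Google client libraries rely on; it is not part of Matillion's tooling.

```python
def check_service_account(key: dict) -> list:
    """Return a sorted list of required fields missing from a parsed
    Google service-account JSON key (illustrative pre-flight check)."""
    required = {"type", "project_id", "private_key", "client_email"}
    return sorted(required - key.keys())

# A complete key passes (values here are placeholders):
ok = {"type": "service_account", "project_id": "my-project",
      "private_key": "-----BEGIN PRIVATE KEY-----...",
      "client_email": "loader@my-project.iam.gserviceaccount.com"}
print(check_service_account(ok))

# A truncated key reports what is missing:
print(check_service_account({"type": "service_account"}))
```

If any fields are missing, regenerate the key from the Google Cloud console rather than editing the file by hand.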

Amazon S3

  • An Amazon Web Services (AWS) account. Signing up is free: go to https://aws.amazon.com to create an account if you don’t have one already.
  • The AWS username and password for the account used during testing.
  • Permissions to create and manage S3 buckets in AWS. Your AWS user must be able to create a bucket (if one doesn’t already exist), add/modify bucket policies, and upload files to the bucket.
  • The IAM role used by the Agent container must have s3:PutObject permission for the S3 bucket and prefix used as the pipeline destination.
  • An existing, accessible Amazon S3 bucket.
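To illustrate the Agent role requirement above, the sketch below builds a minimal IAM policy that grants s3:PutObject on one bucket and prefix only. This is an assumption-laden example, not Matillion's official policy; the bucket and prefix are placeholders.

```python
import json

def agent_put_policy(bucket: str, prefix: str = "") -> str:
    """Minimal IAM policy (illustrative) scoping s3:PutObject to the
    destination bucket and prefix used by the pipeline."""
    prefix = prefix.strip("/")
    # Object-level actions target bucket/prefix/*, not the bucket ARN itself.
    resource = (f"arn:aws:s3:::{bucket}/{prefix}/*" if prefix
                else f"arn:aws:s3:::{bucket}/*")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:PutObject", "Resource": resource}
        ],
    }
    return json.dumps(policy, indent=2)

print(agent_put_policy("my-destination-bucket", "loads/daily"))  # placeholders
```

Scoping the resource to the prefix keeps the Agent from writing anywhere else in the bucket.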

Azure Blob

  • Your destination should be an Azure Storage account that supports containers, such as BlobStorage, Storage, or StorageV2.
  • At minimum, the Reader & Data Access role is required for sufficient permissions. The role must be assigned on the Azure Storage account in which your destination container is located.
  • The destination container needs to use an access key for authentication.
  • The agent container needs a Shared Key, injected as an environment variable, to authenticate to the storage container.
  • If your storage account only allows access from selected networks, IP allowlisting is needed.
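To show how the Shared Key requirement above fits together, the sketch below reads a key from an environment variable and assembles a standard Azure Storage connection string. The environment variable name and account name are placeholders, not values Matillion mandates.

```python
import os

def blob_connection_string(account: str, key: str) -> str:
    """Assemble an Azure Storage connection string (standard format)
    from an account name and a Shared Key."""
    return (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account};"
        f"AccountKey={key};"
        "EndpointSuffix=core.windows.net"
    )

# The agent would read the Shared Key from its environment; the variable
# name below is illustrative only.
key = os.environ.get("AZURE_STORAGE_KEY", "<shared-key>")
print(blob_connection_string("mystorageacct", key))  # placeholder account
```

Keep the key out of source control: inject it at container start-up rather than baking it into an image.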

Warning

Matillion will continue to add and update more descriptive information in these sections to help you investigate, diagnose, and fix issues yourself, or to reach Matillion Support at your earliest opportunity.