Environments

Environments


Overview

An Environment in Matillion ETL describes how your instance connects to a single connection to a database as well as which sets of cloud platform credentials to use. Numerous Environments can be set up at a user-level, meaning different users can use different Environments on a single Matillion ETL instance. When a user runs a job, the job runs within the Environment currently in use.

The Create/Edit Environment screens look slightly different depending on your Matillion ETL version as it depends on the native cloud platform and data warehouse. Thus, screenshots here are examples but full details are listed for all possible Matillion ETL products.



Creating and Editing Environments

Before starting any job with Matillion ETL, the user must have set up an Environment. To begin, click the Project button in the top left of the Matillion ETL window, and then click Add Environment.

Alternatively, in the lower left corner of the Matillion ETL window, right-click on one of your Environments to navigate to the Add Environment button.

The Environment configuration window has 3 parts (except for Matillion ETL for BigQuery, which has a single screen):

  • Cloud Platform Connection: Specific cloud platform credentials used by this environment.
  • Data Warehouse Connection: Details that allow Matillion ETL to use your data warehouse (Redshift, Snowflake, BigQuery or Synapse).
  • Data Warehouse Defaults: Default resources for Matillion ETL to use such as buckets, schemas and databases.

Each part is described in full in the following sections



Cloud Platform Connection

This section covers which cloud platform credentials your Matillion ETL instance will use. By default, Matillion ETL will attempt to use the instance credentials to discover available resources. However, users can also define their own credentials from within the Matillion ETL client - see Manage Credentials for more information.

  • Environment Name: Describes the Environment. e.g. Test/Live etc
  • AWS Credentials: Choose a set of AWS Credentials Credentials to use with this Environment.
  • GCP Credentials: Choose a set of GCP Credentials to use with this Environment.
  • Azure Credentials: Choose a set of Azure Credentials to use with this Environment.



Data Warehouse Connection

In this section, users must add connection and authorisation details for their data warehouse. An environment can only be configured to a single data warehouse and the type of data warehouse is limited by your Matillion ETL version (e.g. Matillion ETL for Snowflake can only use Snowflake as a data warehouse).

Redshift Connection

  • Redshift Cluster: If you have created an IAM Role, this box will list your Redshift clusters that are visible to that role in the same region as the Instance is running. Selecting that cluster will populate many of the connection settings fields on the screen with sensible defaults. If the box is empty these fields will need to be set manually.
  • Endpoint: The physical address of the Leader Node. It will be either a name or an I.P. address.
  • Port:  Usually 5439 or 5432, but can be configured differently when setting up your Redshift cluster.
  • Username: Username for the Environment connection.
  • Password: Password for the Environment connection. Users have the option to store their password inside the component; however, we highly recommend using the Password Manager feature instead.
  • Enable SSL: Do this to encrypt communications between Matillion and Redshift. Some Redshift clusters may be configured to require this.



Snowflake Connection

  • Account: The name of your Snowflake account. More information about gathering Snowflake account details can be found here.
  • Username: Username for the Environment connection.
  • Password Type: Choose between using a password or a Private Key to authenticate your Snowflake connection.
  • Password: Password or Private Key (as chosen in Password Type) for the Environment connection. Users have the option to store their password or private key inside the component; however, we highly recommend using the Password Manager feature instead.



BigQuery Connection

BigQuery requires no connection details as it is used innately based on your instance credentials.



Synapse Connection

  • Endpoint: The Server Name for your Microsoft SQL Database
  • Port: The port number of your Microsoft SQL Database (1433 by default)
  • Username: A username used to access your Microsoft SQL Database
  • Password: The corresponding password for the above username.
  • Connection Settings: Specify JDBC Connection Properties. See the Microsoft documentation for more information.



Data Warehouse Defaults

In this section, we specify which resources from your data warehouse Matillion ETL should use by default:

Redshift Defaults

  • Database:  However you named it when setting up your Redshift cluster. You may run with multiple database names, in which case, choose the one you would like to use for this project here.
  • Default Schema: 'public' by default, but if you have configured multiple schemas within your Redshift database, specify the schema you would like to use here.
  • Default Bucket: Specific the default S3 bucket to use in many Matillion ETL components and features. This does not prevent a bucket being specified in such cases.
  • Concurrent Connections: (Enterprise Mode only.) State the maximum number of database connections to allow concurrently. This can potentially speed up large ETL jobs. The maximum number of connections possible is equal to the number of threads the instance has at its disposal. See our short Video Tutorial for more details.



Snowflake Defaults

  • Default Role: Select a Role. For more information, please consult the Snowflake documentation.
  • Default Warehouse  Select the Environment's Default Warehouse. More...
  • Default Database:  Select the Environment's Default Database. More...
  • Default Schema:  Select the Environment's Default Schema.



BigQuery Defaults

  • Default Project: Select the Environment's Default Project.
  • Default Dataset: Select the Environment's Default Dataset.



Synapse Defaults

  • Default Database: Select the Environment's Default Project.
  • Default Schema: Select the Environment's Default Schema.
  • Concurrent Connections: (Enterprise Mode only.) State the maximum number of database connections to allow concurrently. This can potentially speed up large ETL jobs. The maximum number of connections possible is equal to the number of threads the instance has at its disposal. See our short Video Tutorial for more details.