Variables

Variables


Overview

Variables are name-value pairs stored within each environment. Variables can be used in all sorts of parameters and expressions to allow the user to pass and centralise environment specific configuration. They are also easily accessible through Python and Bash scripts where they are substituted in at run time.

Matillion ETL supports Environment Variables, Automatic Variables, Job Variables and Grid Variables. For detailed information on each type, please visit the below links. For more general information on variables in Matillion ETL, read on.

Warning

Variables become first-class variables in Python and Bash scripts. So, great care should be taken to avoid naming them in a manner that clashes with key words in either language. It is recommended a prefix (for example, v_) be used to ensure no such conflicts occur.

Variables can be referenced with the syntax: ${<variable name>}

When a job is run, variables are resolved by first consulting job variables, then environment variables, then system variables. Thus, if a job variable and environment variable of the same name exists, the job variable will be preferentially used.

Important Information

  • All alpha-numberic characters, as well as underscores, are valid for use in variable names (for example, ${my_table_82} and even Javascript expressions such as ${new Date().getFullYear()})
  • However, a variable name cannot begin with a digit (for example, ${82_my_table} is not a valid variable name)

Behaviour (Copied and Shared)

Throughout this article, variables may be referred to as Copied and Shared. This refers to their "branch behaviour" inside a job. A "branch" in this context means a divergence of connectors within a job, giving a branched structure. Iterator components are also examples of branching, albeit with a different aesthetic. The branch behaviour describes how a variable behaves when updated during a job run:

  • Copied variables can be updated within one branch of a job without updating variables of the same name in other branches. A branch using a copied variable will begin by taking its default value, even if this variable has been updated in another, parallel branch.
  • Conversely, Shared variables are updated in a job-wide fashion. As such, regardless of branches, if a Shared variable is updated in one branch, all other branches will use that updated value.

Please Note

In previous versions of Matillion ETL, Copied and Shared variables have been referred to as Local and Global, respectively, and can be thought of synonymously when reading the documentation.

Manage Environment Variables


Types

​The value of any given variable must be one of the following types. This type defines how a variable can be used and is important to consider when passing the variable to components and functions. In the Matillion ETL client, types are often referred to by the symbols shown below:

Symbol Type Description
Text Any text string
Numeric Any number, with or without decimals (Matillion ETL for Redshift also allows Real and Double Precision types in some features)
DateTime Date and time values in the following formats yyyy-MM-dd, yyyy-MM-dd HH:mm, yyyy-MM-dd HH:mm:SS, or yyyy-MM-dd HH:mm:SS.SSS

Text Mode

On many menus, such as Manage Job Variables and Manage Environment Variables, users can switch to a Text Mode by ticking the checkbox next to Text Mode. Switching to Text Mode allows users to:

  • Edit variables as a set of space-delimited text entries
  • Convert all existing variable data into an appropriate format
  • Copy and paste text, offering a fast way to move large numbers of variables
  • Have certain types of missing data auto-completed by Matillion ETL when parsing
  • Error check data before adding it to the project (Matillion ETL parses the data on exiting Text Mode, throwing a message if an error has been made)

Text Mode


Filtering Booleans

It is possible to use boolean statements to filter values. In the example below, you can type True or False for the value combined with AND and OR conditional statements. You can also use 1 and 0 for the values, which work as True or False values, respectively.

Filtering Booleans


Example 1

As a basic example, a defined variable can be used to perform a simple calculation in a Transformation Job. The job consists of a Generate Sequence component linked to a Calculator component. The Generate Sequence component can be used to create a column of numbers that are passed to the Calculator.

Calculating columns from a general sequence

  1. Click the Generate Sequence component icon. Then, in the Properties panel, set the Starting Value, Increment Value and End Value to "1", "1" and "10", respectively. This will create a column of data that counts from 1 to 10.

    Setting up the General Sequence component

  2. Next, click on the Calculator component icon. Enter the Calculation Editor and add a new calculation. This will allow values in the sequence row (output by the Generate Sequence component) to be multiplied, and then multiply that value by a number. In this case, each value in the sequence column is multipled by 4 using the following calculation:

    "sequence"*4

    Sample output

  3. The output of the Calculator can be viewed by clicking the Sample tab from within the component's Properties panel, then clicking ⟳ Data. It will show the sequence column is being multiplied by the number specified in the calculation.

    Sample output

  4. A variable in this calculation could also have been used instead. First, a variable must be declared via ProjectManage Environment Variables. A new variable can be declared here and given a type, value and scope. Since the variable for this example will be used in a simple calculation, the Numeric type is appropriate.

    Please note

    The scope is of little consequence for this job and can be set to Shared.

  5. The default value can then be set to any number—6 is chosen in this example.

    Edit Environment Variables

  6. Finally, the Calculator component must be instructed to use this variable in its calculation. Reentering the Calculation Editor, the constant multiplier can be replaced with the newly-declared variable using the following calculation:

    "sequence" * ${example_var}
  7. Checking the sample output confirms that the Calculator component has correctly used the variable's default value in the calculation.

    Checking Sample output


  8.  

    Example 2

    In this example, two environments have been created, namely Test and Live. A common use case is to work on a more limited dataset in Test while having a full dataset in Live. A "starting_year" variable will define how much data is to be used in the Test and Live environments when applied in a filter.

    1. Variables must be predefined, with a default value for each environment. It is not necessarily true that two environments have different default values. However, in this example, the variable "starting_year" will be set to an earlier value for use with the Live environment. In the Test environment, less data is required and can thus start collecting it from a much later year.

    2. Now that a variable has been defined, it can normally be used in all expression editors and free text fields. In this example, the variable is used by a Filter component to limit the dataset of a Table Input component.

    3. The Filter component is setup to filter any table rows with values greater than ${starting_year) in the "year" input column. Due to how default values were defined for "starting_year", if this job is run in the Test environment, only flights from 2016 onward will remain. In the Live environment, flights from 2006 onward will remain.

      Please Note

      JavaScript expressions are only available to use as parameter values within Matillion ETL components. Any valid single JavaScript expression may be used; however, it is recommended that only simple expressions are used. All variables defined in the job and / or environment should also be available to reference.

      Error

      When entering values into a component's parameter editor, everything enclosed within ${ } of the literal string will be evaluated immediately. This validation process currently does not take variables into account and may assume the value is incorrect.

    4. The SQL generated by this component and its settings can be viewed by clicking the SQL tab. The example below shows the SQL for the Live environment.


    5. Instead of relying on variables with default values, a Javascript expression can also be used. Javascript expressions do not require defining any variables in the environment. An example for the Live environment is given below, using the Javascript expression ${newDate()).getFullYear()-10}.

    6. This Javascript expression will generate the following SQL, given the current year generated from the Javascript Date() function is 2016.