Jobs


Overview

Jobs are Matillion ETL's main way of designing, organising and executing workflows. The most common usage of Matillion ETL is to build strings of configured components inside a job and then run that job to accomplish a desired task such as loading or transforming data. The area in which components are laid out within a job is called the canvas.

Important Note: Although most components are common to all platforms, some are available only in certain versions of Matillion ETL. For example, Matillion ETL for Redshift may have some Redshift-specific components that Matillion ETL for BigQuery does not, and vice versa.

There are two main flavours of jobs in Matillion ETL: Orchestration and Transformation.

  • Orchestration is primarily concerned with DDL statements (especially creating, dropping and altering resources) and with loading data from external sources.
  • Transformation is used for transforming data that already exists within tables. This includes filtering data, changing data types and removing rows.
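The split between the two job types can be illustrated outside Matillion ETL with plain SQL. The sketch below is only an analogy, assuming an in-memory SQLite database and a hypothetical raw_orders table; it is not how Matillion ETL itself runs jobs. The first half does orchestration-style work (creating a resource and loading data), and the second half does transformation-style work (filtering rows and changing a data type on data that is already in the table).

```python
import sqlite3

# Assumption: SQLite stands in for the target data warehouse.
conn = sqlite3.connect(":memory:")

# Orchestration-style work: manage resources (DDL) and load external data.
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount TEXT, region TEXT)")
external_rows = [(1, "10.50", "EU"), (2, "3.25", "US"), (3, "7.00", "EU")]  # hypothetical source data
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", external_rows)

# Transformation-style work: filter rows, cast a column, then aggregate.
eu_total = conn.execute(
    "SELECT SUM(CAST(amount AS REAL)) FROM raw_orders WHERE region = 'EU'"
).fetchone()[0]

print(eu_total)
```

The ordering matters here for the same reason it does in Matillion ETL: the table has to exist and hold data before any transformation can run against it.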

Shared Jobs are simply user-created, packaged jobs that can be Orchestration and/or Transformation and are used in much the same way as a component.


Orchestration Jobs

Orchestration Jobs deal with the management of resources (such as tables) as well as loading data from external sources. This typically makes an Orchestration Job a user's first real use of Matillion ETL, as data must be loaded into a table before it can be transformed. An Orchestration Job is chiefly defined by the components it contains; full documentation for these can be found in the 'Orchestration' category.

Data can be loaded using Connectors (also referred to as data stagers, integrations or query components). For more information, see Connector Components.


Transformation Jobs

Transformation Jobs are, predictably, concerned with transforming data within tables. This generally comes in the form of components named after the functions and DML commands they represent, such as Rank and Aggregate. A Transformation Job is chiefly defined by the components it contains; full documentation for these can be found in the 'Transformation' category.

Unlike Orchestration Jobs, Transformation Jobs have no specific 'Start' point, and many flows can be specified to run at once simply by creating multiple strings of components. Users may want to consider Job Concurrency when building such jobs.
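As a rough conceptual model only, two unconnected strings of components behave like two independent pipelines that can be started together. The sketch below assumes hypothetical components written as plain Python functions and uses a thread pool to run both strings; it does not reflect how Matillion ETL actually schedules work, which is governed by its Job Concurrency settings.

```python
from concurrent.futures import ThreadPoolExecutor


def run_flow(components):
    """Run one string of components in order, feeding each output to the next."""
    rows = list(range(10))  # hypothetical input table
    for component in components:
        rows = component(rows)
    return rows


# Two independent strings of (hypothetical) components with no link between them.
flow_a = [
    lambda rows: [r for r in rows if r % 2 == 0],  # filter
    lambda rows: [r * 10 for r in rows],           # calculate
]
flow_b = [
    lambda rows: [r for r in rows if r > 6],       # filter
]

# Neither flow depends on the other, so both can be submitted at once.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(run_flow, [flow_a, flow_b]))

print(results)
```

Because the two strings share no components, neither has to wait for the other, which is why concurrency limits become relevant as more strings are added to a job.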


Shared Jobs

Shared Jobs are packaged jobs, created by users, that can be used similarly to how a component is used. This can be conceptually understood as a user-created component, the complexity of which could be as simple as a single Python Script component or as complex as an entire ETL workflow.

Smart use of Shared Jobs allows users to reduce the complexity of their workspace: repeatable, complex workflows are packaged into a single Shared Job, which is then used in other workflows. In many ways this is a neater, more configurable, more portable version of linking to other jobs using the Run Orchestration and Run Transformation components.
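One way to picture this is as a parameterised function that hides a multi-step workflow behind a single call. The sketch below is an analogy only: the function name, its parameters and the step descriptions are all hypothetical and do not correspond to any Matillion ETL API.

```python
def load_and_clean(table_name, min_amount):
    """Hypothetical 'shared job': a packaged workflow exposed through two parameters."""
    return [
        f"create table {table_name}",
        f"load data into {table_name}",
        f"filter {table_name} where amount >= {min_amount}",
    ]


# Two different workflows reuse the same packaged steps with different settings.
sales_workflow = load_and_clean("sales", 100)
returns_workflow = load_and_clean("returns", 0)

for step in sales_workflow:
    print(step)
```

The calling workflows only see the parameters, so the packaged steps can change in one place without every workflow that uses them being edited.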

For more information, see Shared Jobs.