Tasks are created whenever an orchestration or transformation operation is performed by Matillion ETL. This includes:
- Tasks created by user operations. This includes almost all operations that the user performs that generate database queries, such as running a job, retrieving a sample, or a row count.
- Tasks created by the scheduler.
- Tasks run via the Matillion ETL API. Such tasks will display as "API" in the Tasks tab.
- For Redshift users, tasks created from Amazon SQS (Simple Queue Service).
As an orchestration or transformation job is executed, the main "Run" task is broken down into more granular tasks as the execution continues. For orchestration jobs, the individual tasks are created on the fly as decision points such as If, Loop, And, and Or are reached in the orchestration.
Task information can be viewed in several ways. Task information is immediately available in the Task panel at the bottom-right, giving concise information on recently run tasks and how they have progressed.
Clicking the icon next to a job displays further task information in a new tab.
A fuller report on all tasks logged by the instance can be found in the Task History.
Scheduled or queued Matillion ETL jobs will only run if the Matillion ETL instance is running.
Internally, Matillion ETL uses queues to manage tasks. There is one queue per environment. If there is a long-running task, subsequent tasks will be queued behind it. This will show in the task panel with a "waiting" icon.
Matillion ETL behaves like this because it's usually most efficient, on both loads and transformations, to let the parallel database engine manage the concurrency. If this behaviour is undesirable, it can be worked around by using multiple environments; however, this isn't recommended.
In addition, tasks initiated by the scheduler and the SQS queue listener will also queue behind long-running tasks in the same environment.
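The queueing behaviour described above can be pictured with a small toy model. This is only an illustrative sketch, not Matillion ETL code: the `EnvironmentQueues` class, its method names, and the environment and task names are all hypothetical, and the only behaviour assumed from the text is one first-in, first-out queue per environment.

```python
from collections import defaultdict, deque


class EnvironmentQueues:
    """Toy model: one FIFO task queue per environment (hypothetical, not Matillion code)."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def submit(self, environment, task):
        """Add a task; it runs only if nothing is ahead of it in its environment."""
        queue = self._queues[environment]
        queue.append(task)
        return "running" if len(queue) == 1 else "waiting"

    def finish_current(self, environment):
        """Complete the head task and return the task that starts next, if any."""
        queue = self._queues[environment]
        queue.popleft()
        return queue[0] if queue else None


queues = EnvironmentQueues()
statuses = [
    queues.submit("dev", "long-running load"),
    queues.submit("dev", "row count"),       # same environment: waits behind the load
    queues.submit("prod", "scheduled job"),  # different environment: not blocked
]
next_dev_task = queues.finish_current("dev")
print(statuses, next_dev_task)
```

In this model the "prod" task is unaffected by the long-running "dev" task, because each environment has its own queue.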
For these reasons, we strongly recommend that you set up multiple environments to separate your development and production work, so that the task queues for your production environment can't be affected by your development work. These separate environments can connect to the same cluster and database; however, we strongly recommend that they point to different default schemas. For more information on environments, read Environments.
The Task panel at the bottom-right shows the last 20 tasks completed, running, queued, cancelled, or failed since joining the current session. All tasks can be expanded to show sub tasks.
In addition to job information, the Task panel includes:
- Any output from a Python script component.
- Status information from some orchestration components such as the RDS Query.
As orchestration tasks are calculated on the fly, not all future tasks may be listed. However, they will be appended to the list as tasks complete.
- A red x icon indicates a task that has failed to complete.
- A green tick icon indicates a task that has completed successfully.
- A rotating disc icon indicates a task currently in progress.
- An hourglass icon indicates a task that has been queued and will execute when a free thread is available.
To cancel a task, right-click on its entry in the Tasks panel and then click Cancel.
You can also cancel tasks using the Matillion ETL API, if you know the task ID.
When a task is cancelled, all queued sub-tasks are also cancelled including any remaining loop iterations.
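Cancelling a task through the API might look like the sketch below. It is written under stated assumptions rather than taken from this page: the `/rest/v1/group/name/.../project/name/.../task/id/.../cancel` path layout, the use of POST, and HTTP basic authentication should all be checked against the API documentation for your instance, and the host, group, and project names are placeholders.

```python
import base64
import urllib.request
from urllib.parse import quote


def build_cancel_url(base_url, group, project, task_id):
    """Build the cancel URL. The path layout is an assumption; verify it in your API docs."""
    return (
        f"{base_url.rstrip('/')}/rest/v1/group/name/{quote(group, safe='')}"
        f"/project/name/{quote(project, safe='')}/task/id/{int(task_id)}/cancel"
    )


def cancel_task(base_url, group, project, task_id, user, password):
    """Send the cancel request with HTTP basic auth and return the response body."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        build_cancel_url(base_url, group, project, task_id),
        method="POST",  # assumed HTTP method
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode()


# Placeholder values; only the URL is built here, no request is sent.
example_url = build_cancel_url("https://etl.example.com/", "My Group", "Sales", 1234)
print(example_url)
```

Keeping the URL construction separate from the network call makes it easy to inspect the target address before anything is sent to the instance.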
If a task is stuck and won't cancel, hold CTRL and right-click the task to reveal a Cancel Task and Continue option. This option forces the cancellation of the task in Matillion ETL by setting the stuck thread aside (ideally to resolve itself), which allows the queue to continue.
This doesn't free up the thread currently occupied by the stuck task, so doing this repeatedly may cause your instance to run out of available threads.
Completed tasks, whether successful or failed, are shown in the task history along with all their detail. Click Project → Task History to open the Task History tab. This displays details of each task that has been run or scheduled in this project.
Hovering the mouse over a column heading reveals a drop-down menu with two options:
- Columns allows you to add or remove columns from the task history display.
- Filters allows you to filter the history according to the column you have selected. Filter options depend on the column you select. For example, "Started" lets you filter jobs that started before, after, or on a particular date, while "Task Type" lets you select from a list of possible task types.
The final column (with no name) displays the status icon for the task: completed, failed, or cancelled. This column can be filtered on those criteria, for example to show all failed jobs. This can be particularly useful if you have queued many jobs and come back later to find some have failed and want more information about those tasks.
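The column filters described above amount to simple predicates over the task history rows. The sketch below illustrates the idea on made-up data: the row layout, the job names, the task type values, and the `filter_history` helper are all hypothetical and do not reflect how Matillion ETL stores or exports its history.

```python
from datetime import date

# Hypothetical task history rows; keys loosely mirror the columns mentioned above.
history = [
    {"job": "load_orders", "started": date(2024, 3, 1), "task_type": "Schedule", "status": "failed"},
    {"job": "load_orders", "started": date(2024, 3, 2), "task_type": "Schedule", "status": "completed"},
    {"job": "rebuild_marts", "started": date(2024, 3, 2), "task_type": "API", "status": "failed"},
    {"job": "sample_rows", "started": date(2024, 3, 3), "task_type": "Run", "status": "cancelled"},
]


def filter_history(rows, status=None, started_after=None, task_type=None):
    """Return the rows matching every filter that was supplied."""
    result = []
    for row in rows:
        if status is not None and row["status"] != status:
            continue
        if started_after is not None and row["started"] <= started_after:
            continue
        if task_type is not None and row["task_type"] != task_type:
            continue
        result.append(row)
    return result


failed_jobs = [row["job"] for row in filter_history(history, status="failed")]
recent_failed = filter_history(history, status="failed", started_after=date(2024, 3, 1))
print(failed_jobs, [row["job"] for row in recent_failed])
```

Combining a status filter with a date filter in this way corresponds to returning after a batch of queued jobs and narrowing the view to the ones that failed.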
Importing task information via API
Additional task details
In the Task panel or the Task History panel, click the icon next to a job to display further task information in a new tab. This is particularly useful for surveying complex jobs with many components.
The job can be expanded to show its constituent components and a summary of how they performed in the task. The success or failure of each job and component is indicated by a green tick or a red cross, respectively.
A recorded run time for each component and job is also given, and failed components will have an error message returned that can be expanded using the ellipsis button beside the Message field.