Matillion ETL security best practices
This page describes methods for enhancing the security of your Matillion ETL instances. Specifically, this page focuses on three security principles: confidentiality, integrity, and availability.
If you encounter any problems with the methods described on this page, please contact our support team.
Access to data should only be possible after authentication, and should be subject to some level of authorization.
- Protect your running Matillion ETL instances with a firewall that grants only the access that is strictly necessary. Matillion runs a web service that checks whether your instance is reachable from the public internet. If it is, which usually indicates an overly permissive firewall, you'll see the warning "Your copy of Matillion ETL is publicly available" in the notices window.
- We recommend that you configure Matillion ETL with the minimum environment connection permissions possible. Please be aware that cloud platform (AWS, Azure, and GCP) metadata endpoints are always accessible from the instance, and may allow users to retrieve privileged information, including SSH keys.
- Use HTTPS rather than HTTP. Matillion ETL ships with a self-signed SSL certificate that is fully functional, but that causes your browser to issue an "untrusted" warning. To avoid the warning, you can upload your own certificate and associated private key.
- Enable SSL for JDBC communications between Matillion ETL and all data sources and targets. This is sometimes a JDBC connection option, which you can set using component parameters. It's sometimes a property of the source database, which needs to be configured by the database administrator. For Snowflake and Amazon Redshift, there's an "Enable SSL" option in the Matillion ETL environment.
- When using components that output to permanent cloud storage, choose the option to enable encryption at rest.
- Take advantage of the authentication and authorization options of the target database, especially using a strong username/password combination.
- Don't set up an environment using a powerful "administrator" user. Instead, use a minimally privileged ordinary database user.
- Use the Matillion password manager to store passwords. Choose the option to use KMS encryption rather than the default encoding, which only obfuscates the password and does not encrypt it.
- Keep password access to Matillion ETL enabled. You can authenticate users against a local database, or switch to an existing LDAP server.
- Use project ACLs to grant or deny individual Matillion ETL users access to each project.
- Use Matillion ETL's authorization model: give every person their own named account rather than a generic shared one, and grant each user the minimum privileges they need.
- Use "top-level" data acquisition components where possible, i.e. those that have their own dedicated orchestration component. These almost all link into Matillion's OAuth credentials management system, which allows you to secure connectivity using OAuth.
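As a rough illustration of the firewall advice above, the following sketch flags ingress rules that are open to every address. The rule format (a dict with `port` and `source` keys) and the function name `overly_permissive` are hypothetical and chosen only for this example. Real cloud firewall APIs describe rules differently, so treat this as a starting point rather than a working audit tool.

```python
import ipaddress

# Hypothetical rule format for illustration only: each rule is a dict with a
# "port" and a "source" CIDR range. Real cloud firewall APIs use other shapes.
def overly_permissive(rules):
    """Return the rules whose source range covers every address."""
    flagged = []
    for rule in rules:
        network = ipaddress.ip_network(rule["source"], strict=False)
        # A prefix length of 0 means "any address" (0.0.0.0/0 or ::/0).
        if network.prefixlen == 0:
            flagged.append(rule)
    return flagged

rules = [
    {"port": 443, "source": "203.0.113.0/24"},  # restricted to one range
    {"port": 22, "source": "0.0.0.0/0"},        # open to the world
]
flagged = overly_permissive(rules)
for rule in flagged:
    print(f"port {rule['port']} is reachable from anywhere: {rule['source']}")
```

A check like this only covers the firewall itself. It doesn't replace the warning shown in the notices window, which reflects whether the instance is actually reachable.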
Data should be protected from incorrect modification.
- Don't change data during load. This will ensure that the data transformation jobs are working from an accurate copy of the source data.
- Use Matillion ETL's documentation feature to document and publish the ELT designs.
- Implement a testing process that involves testing all orchestration and transformation jobs on representative (and ideally full-volume) source data.
- Have deployment procedures and a version control methodology.
- Use the Matillion ETL audit trail feature to monitor changes to the jobs.
- Take advantage of database transactions. This can help ensure that multipart data transformations are either completely successful, or else fail completely without leaving data partially updated.
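The all-or-nothing behavior of a transaction can be sketched with Python's built-in `sqlite3` module. This is an illustration of the principle only: SQLite stands in for the target database, and the table names and the `transform` function are assumptions made for the example, not part of Matillion ETL. Transaction semantics in your target database may differ in detail.

```python
import sqlite3

def transform(conn, fail_midway=False):
    """Run a two-step transformation as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("DELETE FROM sales_summary")
            if fail_midway:
                raise RuntimeError("simulated failure between steps")
            conn.execute(
                "INSERT INTO sales_summary "
                "SELECT region, SUM(amount) FROM sales GROUP BY region"
            )
    except RuntimeError:
        return False
    return True

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.execute("CREATE TABLE sales_summary (region TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10), ("east", 5), ("west", 7)],
)
conn.execute("INSERT INTO sales_summary VALUES ('east', 1)")  # stale summary row
conn.commit()

# The failed run rolls back the DELETE, so the old summary row survives.
failed_ok = transform(conn, fail_midway=True)
rows_after_failure = conn.execute("SELECT * FROM sales_summary").fetchall()

# The successful run replaces the summary in a single commit.
success_ok = transform(conn)
rows_after_success = conn.execute(
    "SELECT region, total FROM sales_summary ORDER BY region"
).fetchall()
```

Without the transaction, the simulated failure would leave `sales_summary` empty, which is the partially updated state the recommendation is meant to prevent.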
Ensure that information can be accessed when appropriate and required.
- Control access to the transformed data by using a different database user than the one used by Matillion ETL. This will help ensure that reports or analytics don't get accidentally run against the wrong dataset, or against data that hasn't yet been fully transformed.
- Use one of Matillion ETL's three backup methods: export/import, API export, and root volume backups (AWS only).
- Have disaster recovery procedures and test them.
- Keep software up to date: monitor the notices window for "updates available" messages.
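One small part of testing a recovery procedure is confirming that a backup file is intact before you rely on it. The sketch below assumes the export is a JSON file. The function name `check_export` and the sample file contents are hypothetical and do not reflect the actual layout of a Matillion ETL export. It checks that the file is non-empty and parses, and returns a SHA-256 digest that can be recorded and compared later.

```python
import hashlib
import json
import os
import tempfile

def check_export(path):
    """Confirm an export file parses as JSON and return its SHA-256 digest."""
    with open(path, "rb") as handle:
        raw = handle.read()
    if not raw:
        raise ValueError(f"{path} is empty")
    json.loads(raw)  # raises ValueError if the file is truncated or corrupt
    return hashlib.sha256(raw).hexdigest()

# Write a stand-in export file; the contents are made up for the example.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump({"project": "example", "jobs": []}, tmp)
    export_path = tmp.name

digest = check_export(export_path)
print(f"export OK, sha256={digest}")
os.remove(export_path)
```

A parse check and a checksum do not prove that a restore will succeed. A full recovery test still means importing the backup into a separate instance.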