Configuring Matillion ETL to use a Proxy


Overview

This topic explains how to configure Matillion ETL to use a proxy server.

Proxy servers are commonly deployed to provide an additional layer of security or to act as an intermediary between your servers and the internet. Depending on the scenario, a proxy server may help with URL and web content filtering, IDS/IPS, data loss prevention, monitoring, and advanced threat protection.

Matillion ETL is a Java application hosted on an Apache Tomcat application server. To configure Tomcat to use a proxy server for HTTP/HTTPS communication (a setting that Matillion ETL inherits), follow the instructions below.

Please Note

  • Besides Matillion ETL, other applications on the instance do not depend on Tomcat and instead use the proxy configuration of the underlying Linux operating system. Examples include system processes and services that run in the background, the AWS CLI, Matillion ETL's Bash component, and Python scripts run using the Python 2/3 interpreters in Matillion ETL. To configure proxy redirection for the Linux operating system, refer to the section "Configuring Proxy for Linux" below.
  • This guide presumes that your proxy is already configured and is reachable from the Matillion instance, and that any ports used for communication are open on the respective security groups.



Configuring Proxy for Tomcat

  1. SSH to the Matillion ETL instance.
  2. Edit the following file - /etc/sysconfig/tomcat8
  3. Add the following line at the end of the file:
JAVA_OPTS=" -Dhttp.proxyHost=<<proxy host ip/name>> -Dhttp.proxyPort=<<port number>>"

 

To configure an HTTPS proxy, also add the following line:

JAVA_OPTS="$JAVA_OPTS -Dhttps.proxyHost=<<proxy host ip/name>> -Dhttps.proxyPort=<<port number>>"


  4. Save and close the file.
  5. Restart Tomcat using the following command: sudo service tomcat8 restart

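Optionally, add the following line at the end of the file to instruct the JVM to bypass the proxy for loopback addresses and the instance metadata endpoint. The line appends to $JAVA_OPTS so that the proxy settings added earlier are preserved, and there must be no space between -D and the property name. Restart Tomcat again after making this change.

JAVA_OPTS="$JAVA_OPTS -Dhttp.nonProxyHosts=localhost|127.*|169.254.169.254"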
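
Taken together, the lines added to /etc/sysconfig/tomcat8 in this section might look like the sketch below. The host name proxy.example.internal and port 3128 are hypothetical placeholders, not values from this guide, and the sketch assumes the file is read as shell, which the $JAVA_OPTS reference above suggests.

```shell
# Hypothetical proxy host and port, for illustration only; substitute your own.

# HTTP proxy (first assignment, as in step 3 above).
JAVA_OPTS=" -Dhttp.proxyHost=proxy.example.internal -Dhttp.proxyPort=3128"

# HTTPS proxy, appended to the options already set.
JAVA_OPTS="$JAVA_OPTS -Dhttps.proxyHost=proxy.example.internal -Dhttps.proxyPort=3128"

# Optional: bypass the proxy for loopback and the instance metadata address.
JAVA_OPTS="$JAVA_OPTS -Dhttp.nonProxyHosts=localhost|127.*|169.254.169.254"

# Print the combined options to check the result before restarting Tomcat.
echo "$JAVA_OPTS"
```

Each later line appends to $JAVA_OPTS, so the order of the lines does not matter as long as the first assignment comes first.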





Configuring Proxy for Linux

The following instructions set environment variables on Linux that point to the proxy server. This is useful for Bash or Python scripts run from Matillion ETL, which may run in their own process space.

  1. SSH to the Matillion ETL instance.
  2. Edit the following file - /etc/profile
  3. Add the following line at the end of the file:
export http_proxy=http://<<proxy host ip/name>>:<<port number>>/

To configure an HTTPS proxy, also add the following line:

export https_proxy=https://<<proxy host ip/name>>:<<port number>>/

Some software reads only the uppercase variable names, so it is safe to set HTTP_PROXY and HTTPS_PROXY to the same values as well.
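
As an illustration, the lines added to /etc/profile could set both the lowercase and uppercase forms as follows. The address proxy.example.internal:3128 is a hypothetical placeholder rather than a value from this guide.

```shell
# Hypothetical proxy address, for illustration only; substitute your own.
export http_proxy=http://proxy.example.internal:3128/
export https_proxy=https://proxy.example.internal:3128/

# Mirror the lowercase values for software that reads only the uppercase names.
export HTTP_PROXY="$http_proxy"
export HTTPS_PROXY="$https_proxy"
```

Because /etc/profile is read by login shells, the variables apply to new login sessions; after logging in again, the output of env | grep -i _proxy should include all four.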