Configuring Matillion ETL to use a Proxy


Overview


This article describes how to configure Matillion ETL to use a proxy server.

It is common to deploy proxy servers to provide an additional layer of security or to act as an intermediary between your servers and the internet. Depending on the scenario, proxy servers may help with URL and web content filtering, IDS/IPS, data loss prevention, monitoring, and advanced threat protection, among other functions.


Matillion ETL is a Java application hosted on an Apache Tomcat application server. The following instructions configure Tomcat to use a proxy server for HTTP/HTTPS communication, and Matillion ETL inherits these settings.

Besides Matillion ETL itself, other applications on the instance do not depend on Tomcat and use the proxy configuration of the underlying Linux operating system instead. Examples include system processes and services that run in the background, the AWS CLI, the Bash component in Matillion ETL, and Python scripts run using the Python 2/3 interpreters in Matillion ETL. Refer to the Configuring Proxy for Linux section to configure the proxy for the OS.

This document assumes that your proxy is already configured, that it is reachable from the Matillion ETL instance, and that any ports used for communication are open in the respective Security Groups.

 

Configuring Proxy for Tomcat

  1. SSH to the Matillion ETL instance
  2. Edit the following file - /etc/sysconfig/tomcat8
  3. Add the following line at the end of the file
     
JAVA_OPTS="$JAVA_OPTS -Dhttp.proxyHost=<<proxy host ip/name>> -Dhttp.proxyPort=<<port number>>"

To route HTTPS traffic through the proxy as well, also add the following line.

JAVA_OPTS="$JAVA_OPTS -Dhttps.proxyHost=<<proxy host ip/name>> -Dhttps.proxyPort=<<port number>>"

  4. Save and close the file
  5. Restart Tomcat - sudo service tomcat8 restart
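As a rough sketch of the steps above, the snippet below assembles a single line covering both HTTP and HTTPS from two shell variables. The host proxy.example.internal and port 3128 are hypothetical placeholders, not values supplied by Matillion. The commented-out commands show one possible way to append the line and restart Tomcat.

```shell
# Hypothetical proxy address; replace with your own values.
PROXY_HOST="proxy.example.internal"
PROXY_PORT="3128"

# JVM system properties for HTTP and HTTPS traffic.
PROXY_OPTS="-Dhttp.proxyHost=${PROXY_HOST} -Dhttp.proxyPort=${PROXY_PORT}"
PROXY_OPTS="${PROXY_OPTS} -Dhttps.proxyHost=${PROXY_HOST} -Dhttps.proxyPort=${PROXY_PORT}"

# The literal line to add. \$JAVA_OPTS is left unexpanded so that any
# options already defined in the file are preserved when it is read.
TOMCAT_LINE="JAVA_OPTS=\"\$JAVA_OPTS ${PROXY_OPTS}\""
echo "${TOMCAT_LINE}"

# To apply it on the instance (requires root):
#   echo "${TOMCAT_LINE}" | sudo tee -a /etc/sysconfig/tomcat8
#   sudo service tomcat8 restart
```

Printing the line first makes it easy to review before it is written to /etc/sysconfig/tomcat8.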

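Optionally, add the following line at the end of the same file to instruct the JVM to bypass the proxy for loopback addresses and the instance metadata endpoint, then restart Tomcat again. Note that there must be no space between -D and the property name.

JAVA_OPTS="$JAVA_OPTS -Dhttp.nonProxyHosts=localhost|127.*|169.254.169.254"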
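The bypass list is separated by the | character, which the shell would otherwise treat as a pipeline, so the value needs to stay inside quotes. The following minimal sketch shows the quoting. The variable name NON_PROXY_HOSTS is only illustrative. In the standard Java networking properties, the HTTPS handler also honours http.nonProxyHosts, so a separate https entry should not be needed.

```shell
# Hosts that should bypass the proxy, separated by "|" (the JVM's syntax).
NON_PROXY_HOSTS="localhost|127.*|169.254.169.254"

# Double quotes stop the shell from interpreting "|" as a pipeline
# and from expanding "*" as a filename pattern.
JAVA_OPTS="${JAVA_OPTS:-} -Dhttp.nonProxyHosts=${NON_PROXY_HOSTS}"
echo "${JAVA_OPTS}"
```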

 

Configuring Proxy for Linux

The following instructions set environment variables on Linux that point to the proxy server. This is useful for Bash scripts or Python scripts run from Matillion ETL, which may run in their own process space and therefore do not inherit the Tomcat settings.

  1. SSH to the Matillion ETL instance
  2. Edit the following file - /etc/profile
  3. Add the following line at the end of the file:
export http_proxy=http://<<proxy host ip/name>>:<<port number>>/


To route HTTPS traffic through the proxy as well, also add the following line.

export https_proxy=https://<<proxy host ip/name>>:<<port number>>/


Some software expects the uppercase form of these environment variables, so it is safest to set HTTP_PROXY and HTTPS_PROXY as well.
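A minimal sketch of the resulting lines, with the uppercase duplicates derived from the lowercase ones so the two forms cannot drift apart. The host proxy.example.internal and port 3128 are hypothetical. The no_proxy/NO_PROXY bypass list is an assumption added here to mirror the Tomcat nonProxyHosts example. It is not part of the steps above, and its exact syntax varies between tools.

```shell
# Hypothetical proxy address; replace with your own values.
PROXY_HOST="proxy.example.internal"
PROXY_PORT="3128"

# Lowercase variables, as described above.
export http_proxy="http://${PROXY_HOST}:${PROXY_PORT}/"
export https_proxy="https://${PROXY_HOST}:${PROXY_PORT}/"

# Uppercase duplicates for tools that only read that form.
export HTTP_PROXY="${http_proxy}"
export HTTPS_PROXY="${https_proxy}"

# Assumption: bypass list mirroring the Tomcat nonProxyHosts example.
export no_proxy="localhost,127.0.0.1,169.254.169.254"
export NO_PROXY="${no_proxy}"

# Show what is now set in this shell.
env | grep -i '_proxy=' | sort
```

The scheme in each URL describes how clients connect to the proxy itself. Many proxies accept plain http connections for both variables, so check what your proxy expects before copying the https:// form.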