Authentication with API Profiles

Authentication with API Profiles


Overview

Building from Manage API Profiles, this guide uses practical examples to demonstrate various methods to add authentication to an existing API Profile in Matillion ETL.

Important Information

  • Most APIs require some form of authentication to function; for example, an API key or a username and password.
  • Every API Profile in Matillion ETL contains one or more RSD "files" required to convert the hierarchical XML or JSON data from an API endpoint into a rectangular format that can be used by target platforms.
  • Authentication for an API Profile is stored within one of these RSD files.
  • All API Profiles within Matillion ETL can be run using an API Query component in an Orchestration Job.

Warning

  • The RSD syntax used by Matillion ETL has changed significantly from version 1.41. For the latest information, please refer to Paging with the REST API Query Component 2020.
  • Due to these changes, the syntax of examples used in this document and RSD files on Matillion ETL may differ slightly from current API standards—for example, <rsb: is now <api:.
  • Other updates include the need for users to setup input parameters as inputs rather than dummy columns, and only returning the date on Date data types as opposed to the datetime as previously.

Method 1: Adding HTTP Basic authentication using Connection Options

Authentication can be added directly to an API Profile by editing its RSD file. However, Matillion ETL also allows HTTP Basic authentication to be added via a component using the Connection Options input in the component's Properties panel.

In this example, Matillion's internal REST API – Matillion API – is used, as it is JSON-based and requires HTTP Basic authentication (a common design pattern). This built-in API Profile is pre-configured and can be found in the Manage API Profiles window, and contains two RSD files; namely, Run History Details.rsd and Run History Summary.rsd.

  1. Within the Properties panel of an API Query component, provide details for the following inputs:

    • Profile – select a functioning, pre-defined API Profile from the list provided, in this example select Matillion API
    • Data Source – select a Data Source to be used, in this example select Run History Summary
    • Data Selection – select the data column to used from the Data Source, in this example select id
    • S3 Staging Area – select a S3 Staging Area for the data
    • Target Table – provide a name for the table where the data will be staged, then run the job.

    Running Matillion API with API Query component

    Running Matillion API with API Query component

  2. The job will fail with an authentication error. This is due to no authentication being provided.

    Please Note

    The built-in RSD files within the Matillion API Profile don't contain authentication statements at all, as authentication is based on the user's credentials.

    Authentication error

    Authentication error

  3. For this example, in the Connection Options pop-up window, click + to add following paramaters, then provide values:

    • Add AuthScheme, then enter "Basic" as the Value
    • Add User, then enter the name of a user configured on this Matillion ETL instance as the Value
    • Add Password, then enter the password associated with this above user as the Value, then click OK

    Please Note

    Behind the scenes, the API Query component's AuthScheme connection option parameter works by setting a special HTTP header on your behalf. In the next example, this will be done manually by making editing an API Profile's RSD file.

    Connection Options

    Connection Options


Method 2: Encoding HTTP Basic authentication

Authentication can also be encoded via a scripting tool, then the resulting text string can be entered as an HTTP request header within an RSD file.

In this example, Dotmailer is used as it provides a JSON-based REST API which allows many of the entities, such as "campaigns" and "contacts", to be queried. The API is documented here. The "get all campaigns" web service (which returns all known campaigns in batches of up to 500) will be used.

  1. In the Manage API Profiles window, and click + to set up a new API Profile named "Dotmailer".

  2. Once the new API Profile has been created, click next to the new profile to open the Configure API Profile pop-up window.

  3. Then, in the Files section, click + to add an RSD file. Provide details for the following fields:

    • Filename – enter "Campaigns" as the name of the RSD file
    • URI – enter "https://r1-api.dotmailer.com/v2/campaigns" from the Dotmailer API documentation (Refer to DotmailerCampaigns.rsd for the exact syntax)
    • Repeat Element – enter the document root "/", then click OK

    Please Note

    Depending on the current Matillion ETL setting, an RSD file may be outputting with declarations beginning with <rsb:set or <api:set. Both are acceptable and will function the same.

    Add RSD file to Dotmailer API Profile

    Add RSD file to Dotmailer API Profile

  4. The Dotmailer API Profile will now need authentication. To do this, a dummy account for testing purposes will be used with the username "demo@apiconnector.com" and password "demo". These credentials will first need to be encoded via a scripting tool.

    Please Note

    Since the Dotmailer API uses HTTP Basic authentication, the Connection Options method could also be used as per the previous example.

    • On Linux (for example, on a Matillion ETL instance) the "base64" command can be used to input the credentials via the Bash Script component, as follows:

      echo -n "demo@apiconnector.com:demo" | base64
    • On Windows, PowerShell can be used to input the credentials, as follows:
      $b  = [System.Text.Encoding]::UTF8.GetBytes("demo@apiconnector.com:demo")
      [System.Convert]::ToBase64String($b)

  5. Now the encoded text string can be inserted into the RSD file within the Dotmailer API Profile. This is done by creating a custom "authorization" HTTP header, as follows:

    <rsb:set  attr="Header:Name#"         value="authorization" />
    <rsb:set  attr="Header:Value#"        value="Basic ZGVtb0BhcGljb25uZWN0b3IuY29tOmRlbW8=" />
    Inserting Authorization HTTP header into RSD file

    Inserting Authorization HTTP header into RSD file

  6. The Dotmailer and Campaigns RSD file can now be used with the API Query component without setting Connection Options. If authenticated correctly, running a query should return five records.

    Please Note

    The API Query component can be configured and run within a simple Orchestration Job, or the pre-configured APIQueryDotmailer.json job can be imported into the Matillion ETL environment.

    Inserting Authorization HTTP header into RSD file

    Inserting Authorization HTTP header into RSD file

  7. Building on the above technique, pairs of <rsb:set> declarations can also be used to set any additional HTTP headers. These custom HTTP headers enable authentication to be added to most REST-style API Profiles – allowing "username", "password", "token" and "key" values to be set in specifically named headers, as follows:

    <rsb:set attr="Header:Name#"  value="CustomHeader1" />
    <rsb:set attr="Header:Value#" value="Value1" />
    <rsb:set attr="Header:Name#"  value="CustomHeader2" />
    <rsb:set attr="Header:Value#" value="Value2" />

Method 3: Invoking a SOAP API and embedding authentication into the SOAP request

Matillion ETL can also invoke a SOAP API, with authentication credentials embedded into the SOAP request. However, the method is slightly different.

In this example, the UK National Rail's Live Departure Boards web service (view the document here) will be used, with the GetDepartureBoard service as an endpoint. This will output a list of the trains departing from one named station to another, within a defined time window.

  1. In the Manage API Profiles window, and click + to set up a new API Profile named "OpenLDBWS".

  2. Once the new API Profile has been created, click next to the new profile to open the Configure API Profile pop-up window.

  3. Then, in the Files section, click + to add an RSD file. Provide details for the following fields:

    • Filename – enter "Departure Board" as the name of the RSD file
    • URI – enter "https://lite.realtime.nationalrail.co.uk/OpenLDBWS/ldb9.asmx"
    • Repeat Element – enter "/Envelope/Body/GetDepartureBoardResponse/GetStationBoardResult/trainServices/service" as the part of the SOAP response which lists the matching train services (according to the API documentation), then click OK

    Please Note

    • The xPath value here is not namespace-aware, unlike most other XML processors. So, namespace prefixes don't need to be included in the xPath expression of the RepeatElement.
    • Depending on the current Matillion ETL setting, an RSD file may be outputting with declarations beginning with <rsb:set or <api:set. Both are acceptable and will function the same.
    Add RSD file to OpenLDBWS API Profile

    Add RSD file to OpenLDBWS API Profile

  4. Making a SOAP call also requires a few other adaptations to the basic RSD script, as follows:

    • Line 17 – explicitly setting the HTTP Content-Type header
    • Lines 19 and 20 – set a custom HTTP header which is required by the API
    • Line 24 – requesting an HTTP POST operation (the default is GET)
    • Line 45 – using the xmlproviderGet operation, since SOAP is an XML-based protocol
    • Line 27 to 43 – defining the SOAP request payload as required by the API

    Please Note

    • The SOAP request payload already includes the query parameters on lines 35 to 40 and the authentication token on line 30.
    • Please refer to the pre-configured Departure Board.rsd file for the exact syntax.
    Departure Board RSD file

    Departure Board RSD file

  5. Once the RSD file has been created, a valid API token will need to be acquired from OpenLDBWS Registration portal. Tokens are free and registration only requires a few details.

    OpenLDBWS Registration portal

    OpenLDBWS Registration portal

  6. After registering, the token will be sent via email. This should then be pasted into the XML element at line 30 of the RSD file in place of YourTokenHere.

  7. The OpenLDBWS and Departure Board RSD file can now be used with the API Query component without setting Connection Options. This is a live feed, so if authenticated correctly, the query will yield a variety of results depending on the time of day and how many trains are due to run between the chosen stations in the next few hours.

    Please Note

    The API Query component can be configured and run within a simple Orchestration Job, or the pre-configured APIQuerySOAP.json job can be imported into the Matillion ETL environment.