Connect to any API - Example - Jira Cloud

Connect to any API - Example - Jira Cloud


Overview

This page contains a worked example of Connecting To Any REST API , this time using the Jira Cloud REST API.
The first of the four steps is locating the API documentation.

 

Step 1 - Find the API documentation

Atlassian publish this online guide on their REST API. You can find links from there to the reference documentation on the current version, which was version 2 at the time of writing.

Your Matillion instance must be able to access the API endpoint https://yourcompany.atlassian.net over the network. You can test the connectivity using the Network Connectivity Test shared job which you can download from the Data Stagers - Support page. If Matillion does not have network connectivity to the API endpoint you must get this resolved with your network administrator before continuing.

The Atlassian documentation answers the following questions:

  • How do you authenticate? Using a username and token, described in this document

  • How do you supply parameters? GET and POST methods are both supported. In this example we’ll use GET parameters in the URL

  • Does the API provide data at the necessary granularity? The Jira Cloud API is large, and this example demonstrates querying Issues, and the return format is JSON with embedded arrays as documented here

  • Does the API use paging, and if so how does it provide the information needed to page through the result set? Yes, this is an example of an ‘offset’ pager and is documented here. You can use the startAt, maxResults, and total parameters to extract the full set of data.

 

Now it’s time to put the theory into practice...

 

Step 2 - Load the first page

For convenience it’s often best to use the Matillion server itself to extract and load the first page of data.

To get started, SSH into your Matillion server. It will save some typing later if you set up environment variables containing your company name and security credentials. 

export mycompany=YourCompanyNameHere

export mycredentials=YourName@YourCompany:YourJiraToken

For example:

You can verify network connectivity to the API endpoint using the nc command:

nc -w 5 -v "${mycompany}.atlassian.net" 443

You should see a success message.

If you see errors instead, such as these:

  • nc: getaddrinfo: Name or service not known

  • (tcp) timed out: Operation now in progress

.. you will need to contact your network administrator to make sure Matillion has been granted network access to Jira Cloud.

If connectivity was successful you can extract the first page, with the following command:

curl -k -o page1-raw.json https://${mycompany}.atlassian.net/rest/api/2/search?maxResults=100 --user "${mycredentials}"

Check the output file:

ls -l page1-raw.json

It will be helpful to prettify the JSON so it’s more easily human-readable.

jq "." < page1-raw.json > page1.json

You should now be able to see two files: the raw and the prettified JSON.

ls -l page*

If you need to copy the prettified JSON file onto your local workstation, you can do so with an scp command such as this:

scp -i YourKey.pem YourUser@YourIP:YourFolder/page1.json .

Verify that the data you have extracted looks like the example from the Jira Cloud API documentation.

Now you are ready to upload this as a JSON file into a new API Profile. Follow the steps in Connecting To Any REST API as a guide.

Afterwards find its full path on the Matillion server, for example:

/usr/share/tomcat8/api_profiles/JiraCloud/page1.json

The repeating group in this example is an array element named issues.

In the generate dialog, supply the parameters for this API:

  • XPath to Repeat Element - /issues

  • URI - /usr/share/tomcat8/api_profiles/JiraCloud/page1.json

Once the new RSD file has been created, remove the rsb:script sections associated with INSERTS, UPDATES and DELETES, then press the Test button. A line for t1 should appear at bottom left. Press t1 and the data should appear at bottom right.

In this example, we’re expecting one record per Issue. You should see a sample of your Issue data.

You may now take the opportunity to refine which fields you want to see. Jira issues have a lot of fields, so you can remove any which you don’t intend to use. Do this by editing the <attr> elements within the <rsb:info> near the beginning of the file.

 

Switch the RSD to the real URI

Once you’ve got to the point where the first page of data is appearing correctly, it’s time to switch to using the actual API instead of the sample file!

In the RSD definition, replace the "uri" value with the actual API endpoint. You should use exactly the same URI that you used earlier with the cURL command. Use the actual value for the endpoint address rather than ${mycompany}.

Now the API Query will hit the real endpoint you’ll need to add some authentication. The Atlassian API documentation indicates that Jira Cloud’s Token method uses HTTP Basic authentication, so you can add the three necessary HTTP parameters: User, Password and AuthScheme. Use the actual values rather than ${mycredentials}.

After saving the changes, test the API again and verify that it still retrieves the correct data, at the required granularity.

 

Step 3 - Establish Paging

The first two steps are common to all paging implementations.

First add a new input named 'Rows@Next' to the list of columns inside the existing rsb:info block, after all the other <attr> declarations.

<input name="Rows@Next" desc="Identifier for the next page of results" />

Second, add the following to the rsb:script section just after the <rsb:info> block:

<rsb:set attr="EnablePaging" value="TRUE" />

According to the Jira Cloud API documentation, pagination is performed by the client using the startAt parameter for all pages after the first.

The documentation also warns that the consumer needs to be careful with the row counts:

The actual number of items returned is an implementation detail and this can be changed over time. You can ask for more than you are given.

 

In other words, the actual number of rows returned may differ from the number requested, even between pages. So we need to implement a client-side record counter.

To declare the counter, add this line after the EnablePaging added earlier:

<rsb:set item="userns" attr="rowcounter" value="0"/>

To make it count the records as they are returned, add the following two lines together between the jsonproviderGet and the <rsb:push/>:

<rsb:set item="userns" attr="rowcounter" value="[userns.rowcounter | add(1)]"/>

<rsb:set attr="Rows@Next" value="[Rows@Next | add([userns.rowcounter])]" />

The URL will need to change from page to page, so instead of hardcoding it as you did previously, you must replace the fixed <rsb:set attr="uri" …>  with a conditional block like this:

<rsb:check attr="Rows@Next">

<rsb:set attr="uri" value="https://matillion.atlassian.net/rest/api/2/search?startAt=[_input.Rows@Next]&maxResults=100" />

<rsb:else>

<rsb:set attr="uri" value="https://matillion.atlassian.net/rest/api/2/search?maxResults=100" />

<rsb:set attr="Rows@Next" value="1" />

</rsb:else>

</rsb:check>

Note that special characters in the URI must be replaced by an appropriate XML character reference. These include

Character

Replacement

&

&

<

<

>

>

Space

 

You should test the API inside the Manage API Profiles editor again now, to make sure it’s still working. It’s not possible to verify the paging in this way because it only shows the first 50 records. But assuming you still see some data, you can move on to using an API Query component in an Orchestration job.

 

Switching to an API Query

Add an API Query component to a new, empty Orchestration job, and fill in your Profile name, Data Source etc.

In the Connection Options you will need to add your authentication details again, by adding settings for User, Password and AuthScheme. It’s fine to hardcode these for now: they will be parameterized later as part of productionization.

You may find a default limit of 100 on the component. If so remove it, leaving that setting blank.

When you run the component, it should automatically page through all the Issues, and create a database table containing the columns you chose in the API Profile. Depending on your data volumes this step may take several seconds to complete.

 

Step 4 - Productionize

By this step of the process you should have a fully working API Profile, but using hardcoded authentication parameters.

It’s bad practice to leave hardcoded authentication parameters in place, for two reasons:

  • It’s insecure: anyone logged into Matillion can read the credentials in plain text, so they need to be properly protected

  • Authentication parameters usually expire at some point, so the existing API Profile has a limited shelf life

To proceed, please follow this Using Parameters With API Profiles which describes how to use parameters with API profiles.

Matillion’s Password Manager function is not available for use with API Profiles, so you may choose to follow this Using KMS encrypted passwords in Python which describes how to implement your own password management functionality to protect your private API password.