Manage API Profiles Wizards
Most endpoints that return a list of entities will need to have pagination. Without pagination, a query could return millions or even billions more rows of data than anticipated, causing unnecessary network traffic.
Paging requires an implied ordering. By default, this may be the item’s unique identifier but can be other ordered fields, such as a created date.
Below are the different types of API pagination strategies available in Matillion ETL:
- Full Path
- Relative Path
- Link Header
The strategies listed above can be enabled and managed in the Response Configuration dialog of the Manage Extract Profiles and Manage Query Profiles dialogs, respectively. The methods for adding pagination in these wizards are identical.
This page will provide information about how to configure pagination and strategies for the Manage Extract Profiles and Manage Query Profiles wizards, referred to in Configuring the Response Configuration, as well as listing all of the strategies and properties in the respective wizards.
Configuring the Response Configuration
Refer to Manage Extract Profiles and Manage Query Profiles to access the respective wizards in your Matillion ETL instance.
The Response Configuration dialog in each of the wizards will display a Select Fields section on the left, where you can view the structure of the data returned from the API call, and attempt to identify the repeat element from that data. Users can also specify this manually.
To populate the Repeating Element field you must right-click on the appropriate element in the tree structure, and select Set Repeating Element, or select Set [Param Name], as appropriate. These are not free text fields.
For API Extract profiles, you can only set repeating elements on arrays.
To remove the value from the Repeating Elements field, right-click in the aforementioned field, and click Remove.
As of version 1.55.5, in the Configure Query Connector dialog, the Remove button is not currently functional, and as such, when you set a repeating element in the Response Configuration, you are not able to remove it. This feature will be enabled in a future version.
Optionally, users can configure paging by first toggling the Paging button, situated on the right of the dialog, to ENABLED, and then selecting a paging strategy from the drop-down menu provided. Users should complete the fields related to the selected paging strategy. Refer to Paging strategy properties for more information.
For API Query Profiles, when a paging strategy parameter needs to be specified, it is because the name of the element (for example, CURSOR) used in the RSD file may be different to the actual QUERY used in the response call URI. This applies to all paging options.
Some properties are mandatory, highlighted in red, and require input in order to proceed. Certain properties may become mandatory according to entries in other properties.
Choose the appropriate method to populate these fields. If the tree view selection is required, right-click on an element situated to the left, and click on the name of the corresponding paging field. For more information, refer to Paging strategy properties.
To remove the value from the paging field, right-click in the aforementioned field, and click Remove.
As of version 1.55.5, in the Configure Query Connector dialog, the Remove button is not currently functional, and as such, when you add a strategy property in the Response Configuration, you are not able to remove it. This feature will be enabled in a future version.
Once you have configured all of the steps in both wizards, read Manage Extract Profiles and Manage Query Profiles for more information.
The following examples show the outcome for each type of API profile, using Full Path pagination:
- Endpoint configuration for an API Extract profile.
- Endpoint response in the RSD file of an API Query profile.
In some cases, when you have created an RSD file in the Manage Query Profiles wizard, you may want to escape a text string, as opposed to an individual character, when using full path paging.
Common cases occur where whole properties, such as
@odata.nextLink need to be escaped, and replaced with
<api:set attr="pageurlpath" value="/\\[@odata.nextLink\\]" />. For Matillion ETL to recognise escaping characters, you will be required to modify your RSD file manually.
For more information, read Using parameters with API profiles, and scroll down to 'Escaping square brackets'.
Paging strategy properties
The table below describes all of the pagination strategies, their features, and the implementation methods for the Manage API Profiles wizards. Certain properties are unique to each of the API Profiles wizards.
For the Optional Parameters, if a user sets the paging limit to more than the actual number of rows in the table, all rows will be loaded on one page.
|Strategy||Strategy Description||Property Information|
|Full Path Pagination||The field that will contain the URI (path) to the next page of data.||
|Relative Path||Exclude certain parts of the base URI, and specify the next page to be displayed at the component's run-time.||
|Link Header||After the API call has returned data in the body of the response, it will also return a header parameter containing a full URL of the next page.||
|Offset-based||This is the simplest form of paging. Offset became popular with apps using SQL databases that already have LIMIT and OFFSET as part of the SQL SELECT syntax. Offset pagination will add a record count in the document, and Set Record Count to populate the xpath on the basis of the total number of records.||
|Cursor-based||Insert a cursor into a specified field within the document.||
|Page-based||Sort the API call response by specifying a page number, how many pages you want, and the last page element.||