MindSphere Extract

This article is specific to the following platforms: Redshift, Snowflake, and BigQuery.

Overview

The MindSphere Extract component uses the MindSphere API to retrieve data and store it to be loaded into a table (or, on Redshift, referenced by an External Table). Users can then transform their data with Matillion ETL's library of transformation components.

Important Information

Using this component on Matillion ETL for Redshift may return structured data that requires flattening. For help flattening such data, please read our Nested Data Load Component documentation.

Using this component on Matillion ETL for Snowflake may return structured data that requires flattening. For help flattening such data, please read our Extract Nested Data Component documentation.

Using this component on Matillion ETL for BigQuery may return structured data that requires flattening. For help flattening such data, please read our Extract Nested Data Component documentation.

Note: Tokens acquired from the MindSphere API have a lifespan of 30 minutes. These tokens are stored in an in-memory cache whose entries expire after 25 minutes, at which point a new token is requested.
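As an illustration of that behaviour, here is a minimal sketch of an in-memory token cache with a 25-minute expiry, written in Python. The token endpoint, request payload, and response fields are placeholder assumptions for illustration, not the component's actual implementation.

    import time
    import requests

    TOKEN_TTL_SECONDS = 25 * 60  # evict entries after 25 minutes

    _token_cache = {}  # app name -> (token, time acquired)

    def get_token(app_name, api_user, api_password):
        """Return a cached token, or request a fresh one once the
        cached entry is older than 25 minutes (tokens live for 30)."""
        entry = _token_cache.get(app_name)
        if entry and time.time() - entry[1] < TOKEN_TTL_SECONDS:
            return entry[0]
        # Placeholder endpoint and payload: the real token request
        # depends on your MindSphere tenant and app registration.
        resp = requests.post(
            "https://gateway.example.mindsphere.io/oauth/token",
            auth=(api_user, api_password),
            data={"grant_type": "client_credentials"},
        )
        resp.raise_for_status()
        token = resp.json()["access_token"]
        _token_cache[app_name] = (token, time.time())
        return token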



Properties

The table below describes the MindSphere Extract component's setup properties, including any actions required of the user.

Property | Setting | Description
Name | String | Specify the descriptive name of the component. If a job includes more than one MindSphere Extract component, users might wish to name each component based on its data source.
API | Select | Select which MindSphere API to make queries to.
Data Source | Select | Select the data source. The available data sources depend on the chosen API. Upon selection, any properties specific to the chosen data source will be available for users to configure.
This property is only available when the API property is set to "Asset Management EU1" or "Event Management EU1".
Auth Method | Select | This property cannot be edited and will be hidden once the user selects a data source.
API Username | String | Input the API username for the chosen MindSphere app. For help locating your API username, please refer to our MindSphere Extract Authentication Guide.
API Password | String | Input the API password for the chosen MindSphere app. For help locating your API password, please refer to our MindSphere Extract Authentication Guide.
We recommend storing passwords in the Matillion ETL Password Manager.
App Name | String | Specify the name of the target MindSphere application. Users can see an overview of their MindSphere apps, and create new apps, in the Developer Cockpit interface.
App Version | Version Number | Specify the version number of the target MindSphere application. Each application in the MindSphere Developer Cockpit interface has a version.
Host Tenant | String | Specify the Host Tenant. Users can find this credential in the MindSphere Asset Manager interface.
User Tenant | String | Specify the User Tenant. For help creating a user account or assigning MindSphere permissions, refer to the MindSphere documentation.
Entity Id | String | Specify your Entity Id. The Entity Id must exist in the IoT Entity Service.
This property is only available when the Data Source is set to "IoT Time Series EU1".
Property Set Name | String | Specify the name of the Property Set.
This property is only available when the Data Source is set to "IoT Time Series EU1".
Select | String | Specify the data to select from the Property Set.
This property is only available when the Data Source is set to "IoT Time Series EU1".
Latest Value | Boolean | When True, Matillion ETL will only return the most recent value recorded for the metrics requested.
This property is only available when the Data Source is set to "IoT Time Series EU1".
From Timestamp | | Specify the start point of the desired time range in the API call. All timestamps must be passed in the ISO 8601 format in UTC: YYYY-MM-DDThh:mm:ss.mmmZ, where T is the literal letter T and marks the start of the time element (see the example after this table).
This property is only available when the Data Source is set to "IoT Time Series EU1".
To Timestamp | | Specify the end point of the desired time range in the API call. All timestamps must be passed in the ISO 8601 format in UTC: YYYY-MM-DDThh:mm:ss.mmmZ, where T is the literal letter T and marks the start of the time element.
This property is only available when the Data Source is set to "IoT Time Series EU1".
Limit | Integer | Limit the number of results per page that will be staged.
This property is only available when the Data Source is set to "IoT Time Series EU1".
Asset ID | String | Specify the Asset ID.
This property is only available when the Data Source property is set to "Asset ID" or "Variables".
Filter | JSON String | Provide a JSON string, which will act as an SQL filter for your query; see the example after this table. This parameter is optional.
Page Limit | Integer | Limit the total number of pages to stage.
Location | URL | Select an S3 bucket path that will be used to store the data. Once the data is on an S3 bucket, it can be referenced by an external table. A folder will be created at this location with the same name as the target table.
External Schema | Select | Select the table's external schema. To learn more about external schemas, please consult the "Configuring The Matillion ETL Client" section of the Getting Started With Amazon Redshift Spectrum documentation. The special value, [Environment Default], will use the schema defined in the environment. For more information on using multiple schemas, see Schema Support.
Target Table | String | Provide a name for the external table to be used.
Warning: This table will be recreated and will drop any existing table of the same name.
Location | URL | Select an Amazon S3 bucket, Azure Blob Storage path, or Google Cloud Storage bucket that will be used to store the data. Once the data is on the chosen bucket or blob, it can be referenced by an external table. A folder will be created at this location with the same name as the target table.
Warehouse | Select | Select a Snowflake warehouse that will run the load.
Database | Select | Select a database to create the new table in.
Schema | Select | Select the table schema. The special value, [Environment Default], will use the schema defined in the environment. For more information on using multiple schemas, see Schema Support.
Project | Select | The target BigQuery project to load data into.
Dataset | Select | The target BigQuery dataset to load data into.
Target Table | String | Provide a new table name.
Warning: This table will be recreated and will drop any existing table of the same name.
Cloud Storage Staging Area | URL | The URL and path of the target Google Storage bucket to be used for staging the queried data.
Load Options | Multiple Select | Clean Cloud Storage Files: Destroy staged files on Cloud Storage after loading data. Default is On.
Cloud Storage File Prefix: Give staged file names a prefix of your choice. The default setting is an empty field.
Recreate Target Table: Choose whether the component recreates its target table before the data load. If Off, the existing table will be used. Default is On.
Use Grid Variable: Check this checkbox to use a grid variable. Default is unchecked.
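
For the From Timestamp and To Timestamp properties, the snippet below shows one way to produce a timestamp in the required YYYY-MM-DDThh:mm:ss.mmmZ format. This is an illustrative sketch, not part of the component, and the helper name is our own.

    from datetime import datetime, timezone

    def to_mindsphere_timestamp(dt):
        """Format a datetime as ISO 8601 UTC with millisecond
        precision, e.g. 2021-06-01T09:30:00.000Z."""
        dt = dt.astimezone(timezone.utc)
        return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"

    # Prints: 2021-06-01T09:30:00.000Z
    print(to_mindsphere_timestamp(datetime(2021, 6, 1, 9, 30, tzinfo=timezone.utc)))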
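For the Filter property, the value is a JSON document describing the records to return. The snippet below is a hypothetical example only: the field names and operators shown are illustrative assumptions, and the filter syntax that is actually accepted depends on the chosen MindSphere API, so consult the MindSphere API documentation for specifics.

    # Hypothetical filter: match entities whose name equals "pump-01".
    # The "name" field and "eq" operator are illustrative assumptions.
    filter_json = '{"name": {"eq": "pump-01"}}'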