Text Output Component
This component creates text files in a specified Amazon S3 bucket and loads them with data from an Amazon Redshift table or view.
The data can be output to multiple files based on a "per file row count".
Note: This component is similar in effect to the 'S3 Unload' component. Because Text Output pulls the data through the Matillion ETL instance, it offers some added functionality (such as adding column headers to each file). However, S3 Unload unloads data in parallel directly from Redshift to S3 and so tends to be faster.
|Property|Setting|Description|
|---|---|---|
|Name|Text|The descriptive name for the component.|
|Schema|Select|Select the table schema. The special value [Environment Default] will use the schema defined in the environment. For more information on using multiple schemas, see this article.|
|Table name|Text|The table or view to unload to S3.|
|S3 URL Location|Text|The URL of the S3 bucket to load the data into.|
|S3 Object Prefix|Text|Create data files in S3 beginning with this prefix. The format of the output is:|
|Delimiter|Text|The string placed between values. Defaults to a comma.|
|Compress Data|Select|Whether the resulting files on the S3 bucket are compressed with gzip.|
|Null As|Text|Replace NULL in the input data with the specified string in the output.|
|Output Type|Select|CSV: if a value contains the specified delimiter, a newline or a double quote, the value is returned enclosed in double quotes, and any double quote characters within it are escaped with a second double quote. Escaped: a backslash is inserted before each delimiter, newline or backslash.|
|Multiple Files|Select|If set, multiple files will be created, each containing up to the maximum number of rows specified.|
|Row limit per file|Integer|The maximum number of rows per file.|
|Header|Select|Defaults to Yes: include a header line at the top of each file with the column names.|
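The two Output Type behaviours can be illustrated with Python's standard `csv` module. This is a rough analogy only, not Matillion's implementation, and the sample values are made up for illustration:

```python
import csv
import io

row = ["JFK", "New York, NY", 'He said "hi"']

# CSV mode: values containing the delimiter, a newline or a double quote are
# enclosed in double quotes; embedded double quotes are doubled.
buf = io.StringIO()
csv.writer(buf, quoting=csv.QUOTE_MINIMAL, doublequote=True).writerow(row)
print(buf.getvalue().strip())  # JFK,"New York, NY","He said ""hi"""

# Escaped mode: no quoting; a backslash is inserted before each delimiter.
buf = io.StringIO()
csv.writer(buf, quoting=csv.QUOTE_NONE, escapechar="\\").writerow(
    ["JFK", "New York, NY"]
)
print(buf.getvalue().strip())  # JFK,New York\, NY
```

A consumer of the files needs to know which convention was used, since a backslash-escaped file will not parse correctly with a standard double-quote CSV reader.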
In this example, we have a table of airport data that we want to keep on S3 for long-term storage. Storing it as a CSV is also useful, since the file can then be exported for use in other tools if required. To do this, we use the Text Output component to take the table data and write it to an S3 bucket as part of the job shown below.
Properties for the Text Output component are shown below. We choose an S3 URL Location to output the data to, and in 'S3 Object Prefix' we define a filename for our data. To keep the data immediately accessible and readable, we choose not to compress it, split it into multiple files, or encrypt it. Finally, we choose the CSV output type with our chosen delimiter, '%'.
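With those settings, each output file starts with a header line and uses '%' between values. A small sketch of what the contents might look like (the column names and airport rows here are hypothetical, not from the actual table):

```python
import csv
import io

# Hypothetical airport columns and rows; Header = Yes adds the column-name line.
header = ["iata", "airport", "city"]
rows = [
    ["00M", "Thigpen", "Bay Springs"],
    ["00V", "Meadow Lake", "Colorado Springs"],
]

buf = io.StringIO()
writer = csv.writer(buf, delimiter="%")
writer.writerow(header)
writer.writerows(rows)
print(buf.getvalue())
# iata%airport%city
# 00M%Thigpen%Bay Springs
# 00V%Meadow Lake%Colorado Springs
```

An unusual delimiter like '%' avoids clashes with commas in the data, at the cost of needing to tell any downstream reader which delimiter was used.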
Running this job takes the table and loads it into an S3 bucket in CSV format. The data could then be reloaded into Matillion ETL using the S3 Load component, if required.
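Outside Matillion, such a '%'-delimited file can be consumed by any CSV-aware tool, provided the delimiter is supplied. A minimal sketch (the file contents here stand in for a hypothetical downloaded S3 object):

```python
import csv
import io

# Stand-in for the downloaded S3 object's contents (hypothetical data).
data = "iata%airport%city\r\n00M%Thigpen%Bay Springs\r\n"

# DictReader uses the header line to name the fields.
reader = csv.DictReader(io.StringIO(data), delimiter="%")
for record in reader:
    print(record["airport"])  # Thigpen
```

Keeping the header line (Header = Yes) is what makes this kind of field-by-name access possible in downstream tools.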