Cloud Storage Unload


This article is specific to the following platforms: Snowflake, BigQuery.

Creates files in a specified Google Cloud Storage bucket and loads them with data from a Google Cloud Platform table.



Properties

Property | Setting | Description
Name | Text | The descriptive name for the component.
Project | Text | The name of the Google Cloud Project that contains the source table.
Dataset | Select | Select the table dataset. The special value [Environment Default] uses the dataset defined in the environment. For more information on Google Cloud Datasets, visit the official documentation.
Table | Text | The table or view to unload to a bucket.
Google Storage URL Location | Select | Choose the destination for the new object.
Output Object Name | Name | Give a name to the object.
Format | Select | One of: CSV; JSON (New line delimited), which requires an additional "JSON Format"; AVRO, which requires an additional "AVRO Format".
Include Header | Select | (CSV format only.) Defaults to 'Yes'. Selecting 'Yes' adds a header line containing the column names to the top of each file.
Compression | Select | Output files can be compressed using GZIP compression if selected.
Delimiter | Text | (CSV format only.) The delimiter that separates columns. The default is a comma [,]. A [TAB] character can be specified as "\t".
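
The format, header, compression, and delimiter properties can be pictured as options on an export job. The sketch below is illustrative only: it assumes the unload corresponds to a BigQuery extract job and maps the settings onto the field names of the BigQuery REST extract configuration. The function name `extract_options` and its parameters are hypothetical and are not part of the component.

```python
# Hypothetical mapping from component-style settings to the field names used
# by a BigQuery extract job configuration. This is an assumption about how
# the properties could be expressed, not the component's implementation.
FORMATS = {
    "CSV": "CSV",
    "JSON": "NEWLINE_DELIMITED_JSON",
    "AVRO": "AVRO",
}


def extract_options(fmt="CSV", include_header=True, gzip=False, delimiter=","):
    """Translate component-style settings into extract-job options."""
    options = {
        "destinationFormat": FORMATS[fmt],
        "compression": "GZIP" if gzip else "NONE",
    }
    if fmt == "CSV":
        # Header and delimiter settings only apply to CSV output.
        options["printHeader"] = include_header
        # Treat the two-character escape (backslash, t) as a real TAB.
        options["fieldDelimiter"] = "\t" if delimiter == "\\t" else delimiter
    return options


print(extract_options(fmt="CSV", gzip=True, delimiter="\\t"))
```

The sketch does not check whether BigQuery accepts a given combination of format and compression, so treat the result as a starting point rather than a validated configuration.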

Example

In this example, we have a table of email data that we wish to back up to a bucket for long-term storage. We also want to create a copy of the table to transform, leaving the original intact. One of the many ways to do this is to unload the table to a bucket using the Cloud Storage Unload component, then reload that data into a new table using the Cloud Storage Load component. The job layout is shown below.

The Cloud Storage Unload component properties are shown below. The 'Table' property points the component at the table we want to unload, and 'Google Storage URL Location' sets where the physical file is placed. A file with the name given in 'Output Object Name' is created at this location and holds the table's data. The format, header, and compression settings are left to the user's discretion.
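
For readers who want to see roughly what such an unload looks like outside the component, the following is a minimal sketch using the `google-cloud-bigquery` Python client. It assumes the source is a BigQuery table, that the client library is installed, and that credentials are already configured. The helper names `destination_uri` and `unload_table`, and all project, dataset, table, and bucket names in the usage note, are hypothetical.

```python
def destination_uri(storage_url, object_name):
    """Join a bucket location and an object name into one gs:// URI."""
    return storage_url.rstrip("/") + "/" + object_name


def unload_table(project, dataset, table, storage_url, object_name):
    """Export one table to Cloud Storage as an uncompressed CSV with a header."""
    # Imported lazily so destination_uri() works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project)

    job_config = bigquery.ExtractJobConfig()
    job_config.destination_format = bigquery.DestinationFormat.CSV
    job_config.print_header = True
    job_config.compression = bigquery.Compression.NONE

    job = client.extract_table(
        f"{project}.{dataset}.{table}",
        destination_uri(storage_url, object_name),
        job_config=job_config,
    )
    job.result()  # Block until the export job finishes.
    return job
```

A call such as `unload_table("example-project", "example_dataset", "email_source", "gs://example-bucket/backups", "docs_unload")` would then write the object `docs_unload` under the given bucket path, assuming those resources exist.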

This creates the file 'docs_unload' in the storage bucket. The Cloud Storage Load component then reads this file and loads it into the 'docs_email' table created by the Create Table component. Sampling the new 'docs_email' table in a Transformation job confirms that the data has been unloaded and loaded correctly.
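
The same unload-then-load check can be rehearsed locally before touching any cloud resources. The sketch below is a stand-in that uses only the Python standard library: it writes rows to gzip-compressed, tab-delimited CSV with a header line, reads them back, and compares the result with the input. The column names and rows are made-up sample data, and nothing here calls Cloud Storage or the components themselves.

```python
import csv
import gzip
import io


def unload(rows, header, delimiter=","):
    """Serialise rows to gzip-compressed CSV bytes with a header line."""
    text = io.StringIO()
    writer = csv.writer(text, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return gzip.compress(text.getvalue().encode("utf-8"))


def load(blob, delimiter=","):
    """Parse the bytes produced by unload() back into a header and rows."""
    text = io.StringIO(gzip.decompress(blob).decode("utf-8"), newline="")
    reader = csv.reader(text, delimiter=delimiter)
    header = next(reader)
    return header, list(reader)


# Made-up sample data standing in for the email table.
header = ["id", "sender", "subject"]
rows = [
    ["1", "a@example.com", "Hello, world"],
    ["2", "b@example.com", "Tab\there"],
]

blob = unload(rows, header, delimiter="\t")
loaded_header, loaded_rows = load(blob, delimiter="\t")

# The round trip is sound if the header and every row come back unchanged.
round_trip_ok = loaded_header == header and loaded_rows == rows
print(round_trip_ok)
```

Including a field that contains the delimiter itself, as in the second sample row, is a cheap way to catch quoting problems that a simple row count would miss.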