Cloud Storage Unload

The Cloud Storage Unload component writes files from a table in Google BigQuery into a specified Google Cloud Storage (GCS) bucket.

This component cannot unload views. To unload a view, first create a table with that view's metadata using a Create Table component. Next, use a Table Input component to select the view, then connect it to a Table Output component to copy the data into the new table. Finally, select the new table in the Cloud Storage Unload component.
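For readers who think in SQL, the effect of the three-component workaround above is roughly that of materializing the view into a table. The following is a minimal sketch of that idea, not the component's documented behaviour: the helper `materialize_view_sql` and the dataset, view, and table names are hypothetical, and it only builds a BigQuery `CREATE TABLE ... AS SELECT` statement as a string without running it.

```python
import re

# Accepts dataset.table or project.dataset.table made of simple identifiers.
# This pattern is an illustrative assumption, not BigQuery's full naming rules.
_NAME = re.compile(r"^[A-Za-z0-9_-]+(\.[A-Za-z0-9_]+){1,2}$")


def materialize_view_sql(view: str, new_table: str) -> str:
    """Build a statement that copies a view's rows into a new table."""
    for name in (view, new_table):
        if not _NAME.match(name):
            raise ValueError(f"unexpected table reference: {name!r}")
    return f"CREATE TABLE `{new_table}` AS SELECT * FROM `{view}`"


# Hypothetical names, used only for illustration.
print(materialize_view_sql("my_dataset.email_view", "my_dataset.email_copy"))
```

The resulting table, rather than the view, would then be the one selected in the "Table" property.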



BigQuery Properties

Property Setting Description
Name String Input the descriptive name for the component.
Project Select Select the BigQuery project that contains the table to unload. [Environment Default] is set as the default and uses the project defined in the environment.
Dataset Select Select the BigQuery dataset that contains the table to unload. [Environment Default] is set as the default and uses the dataset defined in the environment.
For more information on Google Cloud Datasets, visit the official documentation.
Table Select Select the table from which data will be unloaded to the GCS bucket.
Google Storage URL Location Filepath | Select Select the Google Cloud Storage bucket. Users can click through the file tree, or use the URL template: gs://<bucket>/<path>.
Output Object Name String Specify a name for the output object (the object that will be created in the chosen GCS bucket).
Format Select Select the format of the data. Users can select one of: AVRO, CSV, JSON (newline-delimited).
Include Header Yes | No (CSV format only) Select "Yes" to add a header line containing the column names to the top of each file. The default setting is "Yes".
Compression Select (AVRO format only) Select the AVRO file format compression type. Options include: Deflate, Snappy, or no compression (None).
(CSV, JSON formats only) Select whether or not output files are to be compressed via GZIP compression.
Delimiter Delimiting Character (CSV format only) Specify a delimiter character to separate columns. The default value is a comma (,).
A [TAB] character can be specified as "\t".
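Several of the properties above depend on the chosen format, which is easy to get wrong when a job is configured by hand. The sketch below restates those rules in Python as a quick consistency check. It is illustrative only: the `UnloadSettings` class, its field names, and the "GZIP"/"None" labels for CSV and JSON compression are assumptions made for this example and are not part of the component. The object name the component actually writes may also differ from the simple path join shown here.

```python
from dataclasses import dataclass
from typing import List

# Compression choices per format, following the property table.
# The "GZIP"/"None" labels for CSV and JSON are assumed names.
COMPRESSION_BY_FORMAT = {
    "AVRO": {"Deflate", "Snappy", "None"},
    "CSV": {"GZIP", "None"},
    "JSON": {"GZIP", "None"},
}


@dataclass
class UnloadSettings:
    """Hypothetical container mirroring the component properties."""
    table: str
    storage_url: str               # template: gs://<bucket>/<path>
    output_object_name: str
    data_format: str = "CSV"
    include_header: bool = True    # CSV only; the default is "Yes"
    compression: str = "None"
    delimiter: str = ","           # CSV only; the default is a comma


def validate(settings: UnloadSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings agree."""
    problems = []
    if not settings.storage_url.startswith("gs://"):
        problems.append("storage URL must follow gs://<bucket>/<path>")
    allowed = COMPRESSION_BY_FORMAT.get(settings.data_format)
    if allowed is None:
        problems.append(f"unknown format: {settings.data_format}")
    elif settings.compression not in allowed:
        problems.append(
            f"{settings.compression} compression is not offered for {settings.data_format}"
        )
    return problems


def object_uri(settings: UnloadSettings) -> str:
    """Join the storage URL and the output object name."""
    return settings.storage_url.rstrip("/") + "/" + settings.output_object_name
```

For instance, `UnloadSettings(table="t", storage_url="gs://b/p", output_object_name="o", compression="Snappy")` fails validation because Snappy is listed for AVRO only, whereas the same settings with `data_format="AVRO"` pass.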

Example

In this example, we have a table of email data that we wish to back up to a bucket for long-term storage. We also want to create a copy of the table to transform, leaving the original table intact. One of the many ways to do this is to unload the table to a bucket using the Cloud Storage Unload component, then reload that data into a new table using the Cloud Storage Load component. The job layout is shown below.

The Cloud Storage Unload component properties are shown below. The "Table" property points the component at the table we want to unload, and the "Google Storage URL Location" property sets where the physical file is placed. A file with the name given in "Output Object Name" is created at this location and holds the unloaded data. The choice of format, header, and compression type is left to the user.

This creates the file 'docs_unload' on the storage bucket. This file is then read back in by the Cloud Storage Load component and loaded into the 'docs_email' table created by the Create Table component. Sampling the new 'docs_email' table in a Transformation Job confirms that the data has been unloaded and loaded correctly.