Cloud Storage Unload
The Cloud Storage Unload component writes the contents of a table or view in Google BigQuery to files in a specified Google Cloud Storage (GCS) bucket.
|Name||String||Enter a descriptive name for the component.|
|Project||Select||Select the BigQuery project that contains the table to unload. [Environment Default] is set as the default and uses the project defined in the environment.|
|Dataset||Select||Select the BigQuery dataset that contains the table to unload. [Environment Default] is set as the default and uses the dataset defined in the environment.
For more information on Google Cloud datasets, see the official Google Cloud documentation.|
|Table||Select||Select the table or view from which data will be unloaded to the GCS bucket.|
|Google Storage URL Location||Filepath | Select||Select the Google Cloud Storage bucket. Users can click through the file tree, or use the URL template:
|Output Object Name||String||Specify a name for the output object (the object that will be created in the chosen GCS bucket).|
|Format||Select||Select the format of the data. Users can select one of: AVRO, CSV, JSON (New line delimited).|
|Include Header||Yes | No||(CSV format only) Select "Yes" to add a header row containing the column names to the top of each file. The default setting is "Yes".|
|Compression||Select||(AVRO format only) Select the AVRO compression type. Options are: Deflate, Snappy, or no compression (None).
(CSV and JSON formats only) Select whether output files are compressed with GZIP.|
|Delimiter||Delimiting Character||(CSV format only) Specify a delimiter character to separate columns. The default value is a comma (,).
A [TAB] character can be specified as "\t".|
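The properties above line up closely with the fields of a BigQuery extract job. The following is a minimal sketch of that correspondence, not a description of how the component is implemented. The function name `build_extract_config`, the example project, dataset, table, and bucket names are all hypothetical; the field names (`sourceTable`, `destinationUris`, `destinationFormat`, `compression`, `printHeader`, `fieldDelimiter`) are those of the BigQuery REST API's extract job configuration.

```python
# Hypothetical helper: maps the component's properties onto a BigQuery
# extract job configuration. The mapping is an assumption for illustration.

# Component "Format" options -> BigQuery destinationFormat values.
FORMATS = {
    "AVRO": "AVRO",
    "CSV": "CSV",
    "JSON (New line delimited)": "NEWLINE_DELIMITED_JSON",
}


def build_extract_config(project, dataset, table, storage_url, object_name,
                         fmt="CSV", include_header=True, compression="None",
                         delimiter=","):
    """Return a dict shaped like a BigQuery extract job request body."""
    if fmt not in FORMATS:
        raise ValueError(f"Unsupported format: {fmt!r}")

    # AVRO accepts Deflate or Snappy; CSV and JSON accept only GZIP.
    allowed = {"NONE", "DEFLATE", "SNAPPY"} if fmt == "AVRO" else {"NONE", "GZIP"}
    codec = compression.upper()
    if codec not in allowed:
        raise ValueError(f"Compression {compression!r} is not valid for {fmt}")

    extract = {
        "sourceTable": {
            "projectId": project,
            "datasetId": dataset,
            "tableId": table,
        },
        # Storage URL plus output object name gives the destination object.
        "destinationUris": [storage_url.rstrip("/") + "/" + object_name],
        "destinationFormat": FORMATS[fmt],
        "compression": codec,
    }

    # Header and delimiter settings apply to CSV output only.
    if fmt == "CSV":
        extract["printHeader"] = include_header
        extract["fieldDelimiter"] = delimiter

    return {"configuration": {"extract": extract}}


if __name__ == "__main__":
    import json

    config = build_extract_config(
        "example-project", "example_dataset", "emails",
        "gs://example-bucket/backups", "docs_unload",
        fmt="CSV", compression="GZIP", delimiter="\t",
    )
    print(json.dumps(config, indent=2))
```

Rejecting invalid format and compression pairings up front mirrors the way the Compression property offers different options depending on the selected format.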
In this example, we have a table of email data that we wish to back up to a bucket for long-term storage. We also want to create a copy of the table to transform, leaving the original table intact. One of the many ways to do this is to unload the table to a bucket using the Cloud Storage Unload component, then reload that data into a new table using the Cloud Storage Load component. The job layout is shown below.
The Cloud Storage Unload component properties are shown below. The "Table" property points the component to the table we want to unload, and the "Google Storage URL Location" property sets where the physical file is placed. A file with the name given in "Output Object Name" is created at this location and contains the unloaded data. The format, header, and compression settings are at the discretion of the user.
This creates the file 'docs_unload' on the storage bucket. This file is then read back in by the Cloud Storage Load component and loaded into the 'docs_email' table created by the Create Table component. Sampling the new 'docs_email' table in a Transformation Job confirms that the data has been unloaded and loaded correctly.
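The unload and reload round trip in this example can be imitated locally to see how the Include Header, Delimiter, and GZIP settings interact. The sketch below uses only the Python standard library and does not touch BigQuery or GCS; the function names `unload_rows` and `load_rows` and the sample email rows are hypothetical. Note that CSV carries no type information, so every value comes back as a string.

```python
import csv
import gzip
import io


def unload_rows(rows, columns, include_header=True, delimiter=",", use_gzip=False):
    """Serialise rows to CSV bytes, optionally GZIP-compressed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    if include_header:
        writer.writerow(columns)
    writer.writerows(rows)
    data = buffer.getvalue().encode("utf-8")
    return gzip.compress(data) if use_gzip else data


def load_rows(data, has_header=True, delimiter=",", is_gzip=False):
    """Parse CSV bytes back into a list of row tuples, skipping any header."""
    if is_gzip:
        data = gzip.decompress(data)
    text = io.StringIO(data.decode("utf-8"), newline="")
    rows = [tuple(record) for record in csv.reader(text, delimiter=delimiter)]
    return rows[1:] if has_header else rows


if __name__ == "__main__":
    columns = ["id", "sender", "subject"]
    emails = [
        ("1", "a@example.com", "Hello, world"),
        ("2", "b@example.com", "Quarterly report"),
    ]
    payload = unload_rows(emails, columns, delimiter="\t", use_gzip=True)
    restored = load_rows(payload, delimiter="\t", is_gzip=True)
    # The reloaded rows should match the originals exactly.
    print(restored == emails)
```

The load side must be given the same header, delimiter, and compression settings that were used for the unload; a mismatch in any of them would produce an extra data row, merged columns, or unreadable bytes.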