Rewrite Table Component

Write the input data flow out to a new table.

Runtime errors may occur, for example if a data value overflows the maximum allowed size of a field.
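As an illustration of this kind of runtime failure, consider a value that is longer than the target field allows. The table and column names below are hypothetical:

```sql
-- Hypothetical target column defined as VARCHAR(5).
CREATE TABLE demo_target (code VARCHAR(5));

-- This statement fails at runtime on Redshift, because the value
-- exceeds the maximum allowed size of the field.
INSERT INTO demo_target (code) VALUES ('ABCDEFGHIJ');
```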

Note: The output table is overwritten each time the component is executed, so do not use this component to output permanent data that you do not want overwritten.



Redshift Properties

Name (String): Input a descriptive name for the component.
Schema (Select): Select the table schema. The special value [Environment Default] uses the schema defined in the environment. For more information on using multiple schemas, see this article.
Target Table (Select): Choose a target table name.
Note: Older versions of this component prepended 't_' to the name to help avoid clashes with existing tables; this is no longer the case.
Table Sort Key (Select): Optional. Specifies the columns from the input that should be set as the table's sort key. Sort keys are critical to good performance; see the Amazon Redshift documentation for more information.
Sort Key Options (Select): Choose the type of sort key to be used.
Compound: A compound key is made up of all of the columns listed in the sort key definition, in the order they are listed. Most useful for tables that will be queried with filters using prefixes of the sort keys.
Interleaved: An interleaved sort gives equal weight to each column, or subset of columns, in the sort key. Most useful when multiple queries use different columns as filters.
Table Distribution Style (Select): Choose how rows are distributed across the cluster.
Even: Distribute rows around the Redshift cluster evenly.
All: Copy rows to all nodes in the Redshift cluster.
Key: Distribute rows around the Redshift cluster according to the value of a key column.
Table distribution is critical to good performance; see the Amazon Redshift documentation for more information.
Table Distribution Key (Select): Displayed only when Table Distribution Style is set to Key. This is the column used to determine which cluster node each row is stored on.
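The sort key and distribution properties map onto standard Redshift CREATE TABLE clauses. A minimal sketch of the DDL the component would generate for a compound sort key and key-based distribution, using illustrative schema, table, and column names:

```sql
CREATE TABLE my_schema.my_target_table (
    year    INTEGER,
    month   INTEGER,
    airtime NUMERIC(18,2)
)
DISTSTYLE KEY                     -- Table Distribution Style = Key
DISTKEY (year)                    -- Table Distribution Key
COMPOUND SORTKEY (year, month);   -- Table Sort Key with Sort Key Options = Compound
```

Choosing Even or All instead emits DISTSTYLE EVEN or DISTSTYLE ALL (with no DISTKEY clause), and choosing Interleaved emits INTERLEAVED SORTKEY in place of COMPOUND SORTKEY.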

Snowflake Properties

Name (String): Input a descriptive name for the component.
Warehouse (Select): Choose the Snowflake warehouse that will run the load.
Database (Select): Choose the database in which to create the new table.
Schema (Select): Select the table schema. The special value [Environment Default] uses the schema defined in the environment. For more information on using multiple schemas, see this article.
Target Table (Select): Choose a target table name.
Note: Older versions of this component prepended 't_' to the name to help avoid clashes with existing tables; this is no longer the case.

BigQuery Properties

Name (String): Input a descriptive name for the component.
Target Project (Text): Enter the name of the Google Cloud Platform project that the table belongs to.
Dataset (Text): Enter the name of the Google Cloud Platform dataset that the table belongs to.
Target Table (Select): Choose a target table name.
Note: Older versions of this component prepended 't_' to the name to help avoid clashes with existing tables; this is no longer the case.

Strategy

Drop and recreate a target table, and at runtime perform a bulk-insert from the input flow.
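This strategy can be sketched as the following sequence of statements. All names are illustrative; the component generates the equivalent statements for the target platform:

```sql
-- 1. Drop any previous version of the target table.
DROP TABLE IF EXISTS my_schema.my_target_table;

-- 2. Recreate it using the column definitions from the input flow.
CREATE TABLE my_schema.my_target_table (
    id    INTEGER,
    label VARCHAR(64)
);

-- 3. Bulk-insert the rows arriving on the input flow.
INSERT INTO my_schema.my_target_table (id, label)
SELECT id, label
FROM upstream_source;  -- stands in for the component's input flow
```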

Example

A sum of airtime, grouped by Year and Month, is written to t_airtime_totals each time the job is run.

The output table's sort key is set to Year and Month, and the table is distributed by Year.
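On Redshift, this example corresponds roughly to the SQL below. The source table name, flights, is an assumption standing in for the upstream input flow:

```sql
DROP TABLE IF EXISTS t_airtime_totals;

CREATE TABLE t_airtime_totals (
    year          INTEGER,
    month         INTEGER,
    total_airtime NUMERIC(18,2)
)
DISTSTYLE KEY
DISTKEY (year)                    -- distributed by Year
COMPOUND SORTKEY (year, month);   -- sort key on Year and Month

INSERT INTO t_airtime_totals (year, month, total_airtime)
SELECT year, month, SUM(airtime)
FROM flights                      -- hypothetical upstream table
GROUP BY year, month;
```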