JDBC Incremental Load


Please Note

This article is part of the series on Incremental Load Tools and Shared Jobs.


Overview

Important information for Snowflake users

This component is only available for Snowflake instances on AWS; it is not available on Azure or GCP.

The JDBC Incremental Load component is a tool designed to let users easily set up a Shared Job that incrementally loads from JDBC-compliant databases, rather than building such a job manually, which would require significantly more expertise.

The JDBC Incremental Load component can be found in the Tools section of the Components panel.

Below is an explanation of what users can expect to find in each section of the load wizard, and what each configuration option entails.


1. Connection Details

On this first page, users can select the type of database to be accessed, and the authorisation details required to access it.

Database Type (Select): The type of database to be used. Currently supported database types:

Connection URL (String): Specify the connection URL for the database. The format of the URL varies considerably; however, a default template URL is offered once a database type has been chosen. Replace any special tags in the URL template with real values. Although many parameters and options can be added to the end of the URL, it is generally easier to add them as Connection Options, documented below.
Username (String): Enter the username used to connect to the database.
Password (Select): Select the password used to authorise the connection.
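Under the hood, the wizard's URL templates are plain strings with tags to be replaced by real values. Below is a minimal sketch of that substitution in Java; the template format, tag names, and host/database values are illustrative assumptions, not wizard defaults.

```java
// Illustrative only: filling a JDBC URL template. The real template
// is supplied by the wizard for the chosen database type.
public class JdbcUrlExample {
    // Replace the special tags in a template URL with real values.
    public static String fillTemplate(String template, String host,
                                      int port, String database) {
        return template
            .replace("<host>", host)
            .replace("<port>", Integer.toString(port))
            .replace("<database>", database);
    }

    public static void main(String[] args) {
        // Hypothetical PostgreSQL-style template; tag names are assumptions.
        String template = "jdbc:postgresql://<host>:<port>/<database>";
        String url = fillTemplate(template, "db.example.com", 5432, "sales");
        System.out.println(url);
        // jdbc:postgresql://db.example.com:5432/sales
    }
}
```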


2. Connection Options

Use this page of the wizard to set any specific connection options. Add a new connection option by clicking the add button and providing both the parameter and the value. To remove a connection option, select it and click the remove button.

Click Test Connection to verify connection settings.
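In plain JDBC terms, connection options are simply parameter/value pairs handed to the driver, typically via a java.util.Properties object. The sketch below illustrates that mechanism; the option names and values are illustrative assumptions, and the component manages all of this for you.

```java
import java.util.Properties;

// Sketch: connection options as parameter/value pairs. A JDBC driver
// receives these through a Properties object.
public class ConnectionOptionsExample {
    public static Properties buildOptions() {
        Properties props = new Properties();
        props.setProperty("user", "etl_user");   // username (illustrative)
        props.setProperty("password", "secret"); // password (illustrative)
        props.setProperty("ssl", "true");        // driver-specific option
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildOptions();
        // A real connection test would then call:
        // DriverManager.getConnection(url, props);
        System.out.println(props.getProperty("ssl"));
        // true
    }
}
```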

Below are links to the connection options for each available database type.


3. Data Sources

Once the database connection test has passed, select the data sources to load from the available selection.

Data Sources in the left-hand list can be included in the load. Data Sources in the right-hand list have been chosen for the load. Data Sources can be selected and moved from list to list using the arrow buttons.


4. Source Columns

The columns available from the selected data source(s) are listed here. Users can add or remove columns, and also specify whether a column is to be incrementally loaded, in the Select Columns dialog.

Users have the option to mark each column as Incremental. Doing so ensures that only data beyond the current maximum of that column (a later date or a higher numeric value) is loaded into the table. This way, the JDBC Incremental Load component can be used to update an existing dataset.

For example, suppose you load 1000 rows of personnel data, each record has an ID column containing a unique number, and you set that column to Incremental. The next time you perform the same load, the existing table data is queried and the largest ID is found. Only source rows with an ID larger than that number are then loaded into the target table, thus updating it.
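The incremental logic above boils down to two SQL statements: one to find the current maximum of the incremental column in the target, and one to fetch only the source rows beyond it. The sketch below generates those statements; the table and column names are illustrative, and the component builds the real queries itself.

```java
// Sketch of the incremental-load query pattern described above.
public class IncrementalLoadSql {
    // Query the target table for the largest value seen so far.
    public static String maxQuery(String table, String column) {
        return "SELECT MAX(" + column + ") FROM " + table;
    }

    // Select only source rows beyond the last seen value.
    public static String incrementalQuery(String table, String column,
                                          long lastSeen) {
        return "SELECT * FROM " + table
             + " WHERE " + column + " > " + lastSeen;
    }

    public static void main(String[] args) {
        // Suppose the target table already holds IDs up to 1000.
        System.out.println(maxQuery("personnel", "id"));
        System.out.println(incrementalQuery("personnel", "id", 1000));
        // SELECT MAX(id) FROM personnel
        // SELECT * FROM personnel WHERE id > 1000
    }
}
```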


5. Configuration (Redshift)

Staging Bucket (Select): Select the S3 bucket where the data will be staged.
Staging Table Prefix (String): The prefix given to the staging tables created.
Staging Schema (Select): Select the schema used to stage the data.
Target Table Prefix (String): The prefix common to tables that will be created from the load. Default is empty (no prefix).
Target Schema (Select): The name of the schema used to create the target tables.
Target Distribution Style (Select): How rows are distributed across the Redshift cluster:
All: Copy rows to all nodes in the Redshift cluster.
Auto: Allow Redshift to manage your distribution style. This is the default setting.
Even: Distribute rows around the Redshift cluster evenly.
Concurrency (Select): The number of concurrent queries. This value can be changed using the Alter WLM Slots component.


5. Staging Configuration (Snowflake)

Staging Bucket (Select): The bucket used to stage the data.
Staging Table Prefix (String): The prefix given to any tables that this component will create.
Staging Warehouse (Select): Choose a Snowflake warehouse that will stage the load.
Staging Database (Select): Choose a database to create each new table in.
Staging Schema (Select): The name of the schema used to stage the data.


6. Target Configuration (Snowflake)

Target Table Prefix (String): The prefix common to tables that will be created from the load. Default is empty (no prefix).
Target Warehouse (Select): Choose a Snowflake warehouse into which to load the data.
Target Database (Select): Choose which database will hold the new tables.
Target Schema (Select): The name of the schema used to create the target tables.
Concurrency (Select): Select between concurrent and sequential queries.


Completion

Once the user completes the wizard, a JDBC Incremental Load component will be present on the canvas. This is not identical to the Database Query component, but does utilise the same ideas to perform an incremental load from a JDBC-compliant database.

Clicking on this new JDBC Incremental Load component reveals several of the options chosen during setup, which can be changed at the user's discretion. Note, however, that changing these options may cause the component to stop working, and they should never be changed while a job is running. Otherwise, this component can be used in all of the ways that any other component can be.