Iterator Components


Overview

Iterators repeat the same work many times, using different parameter values on each pass. An iterator connects to another component and executes that component repeatedly over a set of values. Each iterator has its own way of obtaining or building those values. On every iteration, the iterator assigns the current values to variables and passes those variables to the attached component, which decides what to do with them.

 

We can choose between sequential iterations (where jobs run one after the other) and concurrent iterations (where jobs run simultaneously, provided they won't clash with each other).
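The difference between the two modes can be sketched in plain Python. This is only an illustration of the idea, not how the product implements it; `run_component` and the list of years are hypothetical stand-ins for an attached component and its iteration values.

```python
from concurrent.futures import ThreadPoolExecutor

def run_component(year):
    # Hypothetical stand-in for the attached component; it receives the iteration variable.
    return f"loaded flights_{year}.csv"

years = [2001, 2002, 2003]

# Sequential: each run finishes before the next one starts.
sequential_results = [run_component(y) for y in years]

# Concurrent: runs may overlap in time; map() still returns results in input order.
with ThreadPoolExecutor(max_workers=3) as pool:
    concurrent_results = list(pool.map(run_component, years))
```

Concurrent execution is only safe here because the runs share no state; jobs that write to the same target would need to run sequentially.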

 

The Iterator components are analogous to a loop in programming.

 

As an example, the File Iterator component can load every file from flights_2001.csv to flights_2019.csv without needing 19 separate components. The component loops 19 times, and each loop (iteration) processes one file. The component can search for files on a number of remote file systems (e.g. S3, an FTP server, or a Windows File Share), running its attached component once for each file found. The URL is the path to the folder that contains the files. Without the iterator, we would have to hard-code each value individually.
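As a rough sketch of this behaviour, the following Python loops over a folder listing and calls a component once per matching file. The listing, the regular expression, and the `load_file` function are assumptions made for illustration; a real File Iterator reads the listing from the remote system.

```python
import re

# Hypothetical folder listing; a real File Iterator would fetch this from the URL.
listing = ["flights_2001.csv", "flights_2002.csv", "readme.txt", "flights_2019.csv"]

# Assumed filter: only files named flights_<year>.csv are iterated.
pattern = re.compile(r"flights_(\d{4})\.csv")

def iterate_files(names, component):
    results = []
    for name in names:
        match = pattern.fullmatch(name)
        if match:
            # The file name and the captured year act as the iteration variables.
            results.append(component(filename=name, year=int(match.group(1))))
    return results

def load_file(filename, year):
    # Hypothetical attached component.
    return (year, filename)

loaded = iterate_files(listing, load_file)
```

The non-matching `readme.txt` is skipped, so the component runs three times for this listing.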

 

With the Fixed Iterator component, we define the variables we intend to iterate over and provide default values. The iteration values can then be changed for each pass, e.g. quarter one, then quarter two, and so on.
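A fixed list of values overriding a default can be pictured as follows. The variable name `quarter`, its values, and `run_report` are hypothetical and only illustrate the pattern.

```python
# Default value for the variable, used when an iteration does not override it.
defaults = {"quarter": "Q1"}

# The fixed set of values to iterate over, one entry per iteration.
iteration_values = [{"quarter": "Q1"}, {"quarter": "Q2"}, {"quarter": "Q3"}, {"quarter": "Q4"}]

def run_report(variables):
    # Hypothetical attached component.
    return f"report for {variables['quarter']}"

fixed_results = []
for values in iteration_values:
    # Iteration values take precedence over the defaults.
    variables = {**defaults, **values}
    fixed_results.append(run_report(variables))
```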

 

The Loop Iterator component repeats the same thing a number of times. An iterator is simply a loop that goes from x to y (where x and y depend on the component and its configuration). The user iterates between two numeric values, and the attached component relies on the current value to do something. This avoids having to create several individual jobs. We choose a variable to iterate and provide a sensible default value (e.g. 2010). The iteration can then start at 2010 and continue until the current year, using an expression. Two Loop Iterator components can be nested to create tables based on year and quarter, for example.
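Nesting two numeric loops, one over years and one over quarters, might look like this in Python. The `sales_` table-name prefix is an assumption chosen for the example; the end year is computed from today's date in the same spirit as the expression mentioned above.

```python
from datetime import date

def table_names(start_year, end_year):
    names = []
    # Outer loop: one iteration per year, inclusive of both ends.
    for year in range(start_year, end_year + 1):
        # Inner loop: one iteration per quarter.
        for quarter in range(1, 5):
            names.append(f"sales_{year}_q{quarter}")
    return names

# Start at 2010 and iterate until the current year.
current = table_names(2010, date.today().year)
```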

 

For the Table Iterator, the iterations are set up in advance as rows of a table. We define the variables to iterate and provide sensible default values. The attached transformation runs once for each row in the table, so each row supplies the values for one iteration.
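The row-per-iteration idea can be sketched as below. The rows, the column-to-variable mapping, and `run_transformation` are all hypothetical; in practice the rows would come from a database table.

```python
# Hypothetical table contents; each row drives one iteration.
rows = [
    {"region": "EMEA", "target_table": "sales_emea"},
    {"region": "APAC", "target_table": "sales_apac"},
]

# Assumed mapping from table columns to the variables the transformation expects.
column_to_variable = {"region": "region_name", "target_table": "table_name"}

def run_transformation(region_name, table_name):
    # Hypothetical attached transformation.
    return f"{region_name} -> {table_name}"

table_results = []
for row in rows:
    variables = {var: row[col] for col, var in column_to_variable.items()}
    table_results.append(run_transformation(**variables))
```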

 

A Grid Iterator implements a simple loop over the rows of a Grid Variable. A Grid Variable holds a tabular structure, and its columns can contain values of different types.
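One way to picture a Grid Variable is as a list of rows with named columns of mixed types. The column names, the sample rows, and `describe` below are assumptions for illustration only.

```python
# Hypothetical grid: column names plus rows holding a string, an integer, and a boolean.
grid_columns = ("airport", "year", "active")
grid_rows = [("LHR", 2018, True), ("JFK", 2019, False)]

def describe(airport, year, active):
    # Hypothetical attached component.
    return f"{airport}:{year}:{'on' if active else 'off'}"

# Loop over the rows, pairing each value with its column name.
grid_results = [describe(**dict(zip(grid_columns, row))) for row in grid_rows]
```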

 

Example

For our example, we are transferring files to a new file location, using a File Iterator attached to a Data Transfer component.

The files have been transferred to the new folder using the File Iterator component in conjunction with the Data Transfer component.
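For readers who think in code, the same pattern (iterate over files, transfer each one) can be approximated with the Python standard library. This is a local-disk sketch under assumed folder and file names, not a description of how the File Iterator or Data Transfer components work internally.

```python
import shutil
import tempfile
from pathlib import Path

def transfer_all(source_dir, target_dir):
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    # One iteration per file found in the source folder.
    for path in sorted(source_dir.iterdir()):
        if path.is_file():
            # The copy plays the role of the attached transfer step.
            shutil.copy2(path, target_dir / path.name)
            copied.append(path.name)
    return copied

# Demonstration in a throwaway directory with two hypothetical files.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source"
    target = Path(tmp) / "target"
    source.mkdir()
    for name in ("a.csv", "b.csv"):
        (source / name).write_text("id,value\n")
    copied = transfer_all(source, target)
    transferred = sorted(p.name for p in target.iterdir())
```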