Lead-Lag
  • Dark
    Light

Lead-Lag

  • Dark
    Light

This article is specific to the following platforms - Snowflake - Redshift - BigQuery - Synapse.

Lead/Lag Component

Allows the user to determine a value from a preceding or following row at a given offset within a group (or partition) of values.


Properties

Snowflake Properties

Property Setting Description
Name Text A human-readable name for the component.
Include Input Columns Yes/No Defines whether the component passes all input columns into the output.
Partition Data List of Columns Defines how the input data is partitioned in order to perform the rank calculation. The calculation is then performed on each partition.
Ordering within partitions Input Column The input column name for sorting within the partitioned data. Note: you can drag to reorder
Ordering The order of the sorting: Ascending (Asc) or Descending (Desc)
Functions Window Function Lead - Returns the column from <offset> rows earlier in the partition.
More... Lag - Returns the column from <offset> rows later in the partition.
More...
Input Column The name of the input column that the lead/lag function will return.
Offset The number of rows to go forward (lead) or backwards (lag) in the partition.
Output Column The name of the output column that the window function will create.
Ignore Nulls Yes/No Disregard Null values when determining which row to use. Null values do not count toward reaching the offset.

Redshift Properties

Property Setting Description
Name Text A human-readable name for the component.
Include Input Columns Yes/No Defines whether the component passes all input columns into the output.
Partition Data List of Columns Defines how the input data is partitioned in order to perform the rank calculation. The calculation is then performed on each partition.
Ordering within partitions Input Column The input column name for sorting within the partitioned data. Note: you can drag to reorder
Ordering The order of the sorting: Ascending (Asc) or Descending (Desc)
Functions Window Function Lead - Returns the column from <offset> rows earlier in the partition. More...
Lag - Returns the column from <offset> rows later in the partition. More...
Input Column The name of the input column that the lead/lag function will return.
Offset The number of rows to go forward (lead) or backwards (lag) in the partition.
Output Column The name of the output column that the window function will create.
Ignore Nulls Yes/No Disregard Null values when determining which row to use. Null values do not count toward reaching the offset.

BigQuery Properties

Property Setting Description
Name Text A human-readable name for the component.
Include Input Columns Yes/No Defines whether the component passes all input columns into the output.
Partition Data List of Columns Defines how the input data is partitioned in order to perform the rank calculation. The calculation is then performed on each partition.
Ordering within partitions Input Column The input column name for sorting within the partitioned data. Note: you can drag to reorder
Ordering The order of the sorting: Ascending (Asc) or Descending (Desc)
Functions Window Function Lead - Returns the column from <offset> rows earlier in the partition.
Lag - Returns the column from <offset> rows later in the partition.
Input Column The name of the input column that the lead/lag function will return.
Offset The number of rows to go forward (lead) or backwards (lag) in the partition.
Output Column The name of the output column that the window function will create.

Synapse Properties

Property Setting Description
Name Text A human-readable name for the component.
Include Input Columns Yes/No Defines whether the component passes all input columns into the output.
Partition Data List of Columns Defines how the input data is partitioned in order to perform the rank calculation. The calculation is then performed on each partition.
Ordering within partitions Input Column The input column name for sorting within the partitioned data. Note: you can drag to reorder
Ordering The order of the sorting: Ascending (Asc) or Descending (Desc)
Functions Window Function Lead - Returns the column from <offset> rows earlier in the partition.
Lag - Returns the column from <offset> rows later in the partition.
Input Column The name of the input column that the lead/lag function will return.
Offset The number of rows to go forward (lead) or backwards (lag) in the partition.
Output Column The name of the output column that the window function will create.

Delta Lake Properties

Property Setting Description
Name String A human-readable name for the component.
Include Input Columns Select Defines whether the component passes all input columns into the output.
Partition Data Column Select Defines how the input data is partitioned in order to perform the rank calculation. The calculation is then performed on each partition.
Ordering within partitions Input Column The input column name for sorting within the partitioned data.
Note: you can drag to reorder.
Ordering The order of the sorting: Ascending (Asc) or Descending (Desc)
Functions Window Function Lead: returns the column from <offset> rows earlier in the partition.
Lag: returns the column from <offset> rows later in the partition.
Input Column The name of the input column that the lead/lag function will return.
Offset The number of rows to go forward (lead) or backwards (lag) in the partition.
Output Column The name of the output column that the window function will create.

Strategy

Generates a select statement with a window function in line using the OVER keyword. More details...
Generates a select statement with a window function in line using the OVER keyword. More details...
Generates a select statement with a window function in line using the OVER keyword. More details...

Example

This example uses the flight data. It uses an aggregation to calculate the total flight time per day and then uses the lead lag to add the flight time from the prior day and the prior prior day for comparison.

Note: The aggregation component groups the data by Year, Month and DayOfWeek and totals the airtime per day (see Aggregate Component for more detail)


The component properties are set up as below. The data is partitioned down to individual months and ordered by day.

The Functions are set up to use the Lag Window Function. Two new Lag columns are created from sum_airtime called Airtime Prior Day and Airtime Prior Prior Day. The former has an offset of one hence the day before the current row, and the latter an offset of 2, two days before the current row.

The Sample data shows the new columns added to the result.


What's Next