Data Lineage

Data Lineage is an Enterprise Feature that allows users to track a column across transformations in order to understand how it has been affected by each component up to the point of inspection.

Data Lineage can be accessed via the 'Data Lineage' tab available after selecting a transformation component on the job canvas.


Consider the simple Transformation Job shown below. At any point, we could sample a component and see the current values that each column holds. However, it can be difficult to determine where a column has come from and the journey it has taken, especially for someone who did not create the job.

By selecting a component and browsing to the 'Data Lineage' tab, we can obtain the lineage of the data as it stands at the output of that component.

In the example below, we have checked the lineage of the Calculator component, which yields 6 columns. Clicking one of these columns, projectid, displays its lineage on the right-hand side. Using the arrows to expand the tree, we can follow the PROJECTID column back through the different components. Each entry is given in the format:

<ComponentName or TableName>.<ColumnName>

We see that this column originated in the table DOC_TBL, was brought in by a Table Input component, passed through a Convert Type component, and finally ended up in the Calculator component that we are inspecting.
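To make the entry format and the idea of walking a lineage tree more concrete, here is a minimal Python sketch. It is not part of the product and does not use any product API: the names parse_entry, find_origins and UPSTREAM are hypothetical, and the mapping is hand-written to mirror the PROJECTID example above. It also assumes that column names contain no '.' character, so the last dot in an entry separates the component or table name from the column name.

```python
from typing import Dict, List, Tuple


def parse_entry(entry: str) -> Tuple[str, str]:
    """Split '<ComponentName or TableName>.<ColumnName>' into its two parts.

    Assumption: the column name contains no '.', so we split on the last dot.
    """
    source, sep, column = entry.rpartition(".")
    if not sep or not source or not column:
        raise ValueError(f"not a valid lineage entry: {entry!r}")
    return source, column


# Hypothetical, hand-written lineage for the example above:
# each entry maps to the entries it was derived from.
UPSTREAM: Dict[str, List[str]] = {
    "Calculator.PROJECTID": ["Convert Type.PROJECTID"],
    "Convert Type.PROJECTID": ["Table Input.PROJECTID"],
    "Table Input.PROJECTID": ["DOC_TBL.PROJECTID"],
    "DOC_TBL.PROJECTID": [],  # a table column has no upstream entry
}


def find_origins(entry: str, upstream: Dict[str, List[str]]) -> List[str]:
    """Follow the lineage back from 'entry' and return the entries it originates from."""
    parents = upstream.get(entry, [])
    if not parents:
        return [entry]
    found: List[str] = []
    for parent in parents:
        for origin in find_origins(parent, upstream):
            if origin not in found:  # keep order, avoid duplicates
                found.append(origin)
    return found


if __name__ == "__main__":
    for origin in find_origins("Calculator.PROJECTID", UPSTREAM):
        table, column = parse_entry(origin)
        print(f"column {column} originates in {table}")
```

Expanding the tree in the 'Data Lineage' tab corresponds to following these upstream links by hand, one level per click, until an entry with no parent (here the table DOC_TBL) is reached.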