Detect Changes Component
Detect Changes Component
The Detect Changes component lets users scan two separate (but similar) tables, and insert a new column detailing if data has been inserted, deleted, changed, or even if the data is unchanged.
Any rows with key columns that contain NULL values will be ignored. NULL comparison values are considered equal.
The table below cites the Detect Changes component's setup properties, including any actions required of the user.
|Name||String||Input the descriptive name for the component.|
|Master Table||Select||Select a master table from the two inputs. This table is the one treated as default in the comparison with the second table.|
|Match Keys||Multiple Select||Select the key columns to join the two tables on. These columns must appear in both tables.|
|Compare Columns||Multiple Select||Select the columns that will be checked for changes. Just like the keys, these columns must appear in both tables; however, the two lists should not overlap.|
|Output Column Mapping||Input Column||Select input columns to map to output names. Sensible defaults are provided automatically; however, these can be changed.|
|Output Column||Name output columns to which selected input columns will map.|
|Indicator Column||String||Input a name for the new column in the output. By default, this column is named "Indictator". This column contains an indicator that shows the status of each record:
C the record has been changed.
D the record has been deleted.
I the record is identical.
N the record is new.
Note: switching the master table in the Master Table property will reverse the meaning of new (N) and deleted (D).
Detects changed, unchanged, added, or deleted data in a comparison table relative to the designated master table.
IndicatorsIndicators are single-letter codes that indicate what the state of a row is with regard to Detect Changes. The table below shows all indicators and their meanings.
|C||Changed: the record is present in both tables, with different values, but with the same ID.|
|D||Deleted: the record is present in the master table, but not in the second table.|
|I||Identical: the same record is present in both tables with no changes.|
|N||New: the record is not present in the master table, but is present in the second table.|
In this example, we have two tables containing similar data, and we want to quickly find which rows are different between the two. Our main table contains data for tasks that are to be completed, and the second table holds similar-but-updated data on those tasks. The Transformation Job is shown below.
The two tables feed into a single Detect Changes component. We have chosen the original table of task data to be the master table, although it matters little in this case, since both lists are complete and very similar. We choose to match the two tables by their ID, which is the ID of the task and as such does not change. We then choose to compare the tables via the 'status_id' column, containing values that change according to how near to completion a given task is.
It is instructive next to review the data we are feeding into the Detect Changes component. Below we see the two sets of data. Note that they are very close to identical, with exception to the 'status_id' values, which can change over time.
Running this job will produce the data sampled below. Each row is identified by the ID number and has an associated "Indicator" column. This indicator tells us how each particular row has changed. In this case, we can see that many rows are identical (I). However, some rows have been changed (C), so we know those tasks have been worked on.