Unique value check within the Data Quality Indicator


(Nicholas Zeolla) #1

Hi,

I am looking to create a rule within the “Data Quality Indicator” to test for uniqueness within a column.

I have built a plan similar to the example file 7.01, but can’t find anything that works when creating a rule to check for uniqueness within a column. Any advice?


(Victoria Tuktarova) #2

Hello Nicholas,

I am afraid there is no rule for checking uniqueness in the DQ Indicator. Instead, I would suggest using the countUnique expression in a Group aggregator step. The result shows which values are unique and which are not, marked with 1 and 0.
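To illustrate the kind of flagging described above, here is a minimal sketch in plain Python. It is not Ataccama expression syntax and does not use any Ataccama API. The function name `uniqueness_flags` and the sample column are hypothetical, and the sketch assumes that 1 means "occurs exactly once" and 0 means "occurs more than once".

```python
from collections import Counter


def uniqueness_flags(values):
    """Return 1 for each value that occurs exactly once in the column, else 0.

    Hypothetical helper for illustration only; the 1/0 meaning is an assumption.
    """
    counts = Counter(values)  # number of occurrences per distinct value
    return [1 if counts[v] == 1 else 0 for v in values]


# Hypothetical sample column with one duplicated value.
column = ["A17", "B22", "A17", "C03"]
flags = uniqueness_flags(column)

for value, flag in zip(column, flags):
    print(value, flag)
```

Counting all occurrences first and flagging afterwards means every copy of a duplicated value is marked, not only the second and later ones.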

You can also use the Representative creator step to find and remove duplicates.

Kind regards,
Victoria


(Nicholas Zeolla) #3

Hi Victoria,

Thank you for your reply. Are you referring to the “Groups” heading in a profile?

If so, that is not quite what I’m looking for. I’m looking for a way to track the number of non-unique entries within a given column using a plan, similar to how the tutorial file “7.01 DQI plan” can choose the output file based on rules.

If this can’t be done within the DQ Indicator, is there some way to measure unique values within a column using a different function, while keeping it all within the same Plan?


(Maksim Zhelyazkov) #4

Hello Nicholas,

Unfortunately, there is no easy way to track unique entries in the DQI step. As an alternative, I would suggest using either the Representative Creator step (explained HERE) or the Record Descriptor step (explained HERE) to check the number of unique records.

For example, in the same plan you can use the Representative Creator step, as explained in the link above, to flag grouped duplicate records as false (i.e. not unique), and then use the DQI step to count the number of non-unique records in a report.
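The two-stage idea (first group and flag duplicates, then count and route the flagged records) can be sketched outside the tool in plain Python. This is only an illustration of the logic, not how the Representative Creator or DQI steps are implemented. The function `split_by_uniqueness`, the field names, and the sample records are all hypothetical.

```python
from collections import defaultdict


def split_by_uniqueness(records, key):
    """Group records by `key`, then route each group to one of two outputs.

    Hypothetical helper: groups with one record are treated as unique,
    larger groups as non-unique.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)

    unique, non_unique = [], []
    for group in groups.values():
        # A group with more than one member means the key value is duplicated.
        (unique if len(group) == 1 else non_unique).extend(group)
    return unique, non_unique


# Hypothetical input records; "email" is the column being checked.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": "a@example.com"},
]

unique, non_unique = split_by_uniqueness(records, "email")
report = {"unique": len(unique), "non_unique": len(non_unique)}
print(report)
```

Keeping the two outputs as separate lists mirrors the idea of writing passing and failing records to different files based on a rule.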

Let me know if this works for you or if you have more questions.

Regards,

Maksim