Skip to main content

I have been trying for some time to build a data quality check/rule in Ataccama Desktop (using an advanced expression) to check where a defiend attribute is unique/distinct and have been struggling to find the correct function to use for this.

In a tool like excel I woudl simply count the number of instances of each value and following that I would ensure that anything where the count was >1 failed this check but I have thus far been unable to find a way to perform a distinct count in this way. e.g. DistinctCount((Install_And_Leave_Flag]) = 1

I understand that the profiling step nativley gives you this information but I require a way to do that via an advanced expression for my use case. 

 

Hi @rpainte1,

Record Descriptors are one possible way to achieve your end goal.

There is another post on how to use Record Descriptors, which you might find useful. I have added the link to the post below.


For your use case, I have attached a ONE Desktop plan to demonstrate how it can be used to perform record by record analysis to determine if the value on the current record is distinct.

 

Plan using Randomly generated records

The plan randomly generates some records, either null, install or leave.

Re-run the plan a few times, if you do not get a distinct value the first time. 😀

 

After you have run the plan, have a look at the output.
 

Output showing null and Install values are distinct in the recordset.

Eyeballing, we can see that the null and Install values are distinct.

The Record Descriptor Group Size is equal to 1, and we have explained this observation in the expLAIN and scoRE columns.

I hope this plan helps you with your business rules.
Please let us know how you get on.

Have a great day.

Danny


Thanks Danny for the detailed help! 

It was exactly what I was after and I have now managed to slightly alter and repurpose the plan you shared to work for my use case!

Cheers,

Ross


I have been trying for some time to build a data quality check/rule in Ataccama Desktop (using an advanced expression) to check where a defiend attribute is unique/distinct and have been struggling to find the correct function to use for this.

In a tool like excel I woudl simply count the number of instances of each value and following that I would ensure that anything where the count was >1 failed this check but I have thus far been unable to find a way to perform a distinct count in this way. e.g. DistinctCount((Install_And_Leave_Flag]) = 1

I understand that the profiling step nativley gives you this information but I require a way to do that via an advanced expression for my use case. 

 

Good afternoon--

Just a thought, as you’re licenced to use the ONE Web application, this is functionality that is possible to acheive there. Do you feel that you could switch to the Web interface instead for this?

If you’d like a quick call to run through the options, maybe drop me a DM/email and I could set something up?

 

Matthew Taylor

(Ataccama UK)

 


Yes, aware we can perform this check (much more easily in fact!) via the web application and we are doing this for a number of datasets already. We were previously using IDE however and while we are migrating what plans we can over to web, with the strategic goal of getting it all moved over, this is taking  time and there are some genuine internal reasons why, for a small number of datasets, we are still reliant on IDE for the time being!

 

Thanks for the offer though!

 

Cheers, 

Ross


@matthewtayloruk  can you share the steps for doing this in ONE web?


Hi @suhasjha, thank you for posting! May I ask if you’ve had a chance to check our documentation on this? 


Reply


ataccama
arrows
Lead your team  forward  OCT 24 / 9AM ET
×