Hi everyone,

We have created several monitoring projects in Ataccama, each dedicated to a specific catalog item. Our approach is to create special tables per clientId and measure those within the monitoring project, effectively one monitoring project per catalog item.

Could you please share your thoughts on this approach? Is this a best practice or are there more efficient ways to handle multiple client-specific data quality checks?

Additionally, we want to create individual postprocessing plans for every rule in a monitoring project. For example, if a catalog item has 20 rules, we want 20 separate postprocessing outputs (Excel files) listing the bad records per rule. The goal is to share these sheets with data stewards, who will then fix the data and upload it back.

However, we are encountering a limitation: Ataccama currently does not allow more than 10 postprocessing plans per monitoring project.

What are your recommendations or best practices to handle this? Should we split monitoring projects differently or is there an alternative way to generate and share granular postprocessing results per rule?

Any advice on managing large-scale monitoring and postprocessing workflows in Ataccama would be greatly appreciated!

Thank you in advance!

 

The best practice that I have found to work most efficiently is to group catalog items and data quality checks where it makes sense. Because these projects are based on Client IDs, it makes sense to split them out into individual monitoring projects. But if there are logical groupings of those Client IDs, based on line of business or other factors, it would make sense to group those together as well.

 

I am unsure of your dataset and how broad access to these records is, but there could also be a way to pull all of the records you are attempting to evaluate into a single catalog item, and then create a rule that evaluates based on the Client ID, with a separate condition for each Client ID. The condition that fails will tell you which client needs remediation. This would also allow for a single postprocessing plan that simply filters for all invalid records, greatly reducing the number of postprocessing plans to be run.
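To illustrate how a single export could still give your stewards per-rule Excel files, here is a minimal Python sketch run outside Ataccama. It assumes the single postprocessing plan writes one file of invalid records with columns identifying the failed rule and the client; the file name and the column names `rule_name` and `client_id` are assumptions for the example, not Ataccama's actual export format.

```python
# Minimal sketch: split one invalid-records export into one Excel workbook
# per rule, with one sheet per client, for hand-off to data stewards.
# Assumes a CSV export with (hypothetical) columns "rule_name" and "client_id".
import pandas as pd
from pathlib import Path

EXPORT_FILE = Path("invalid_records.csv")   # output of the single postprocessing plan (assumed)
OUTPUT_DIR = Path("steward_exports")
OUTPUT_DIR.mkdir(exist_ok=True)

invalid = pd.read_csv(EXPORT_FILE)

# One Excel file per rule; within each file, one sheet per Client ID,
# so each steward only sees the records relevant to them.
for rule_name, rule_df in invalid.groupby("rule_name"):
    out_path = OUTPUT_DIR / f"{rule_name}_bad_records.xlsx"
    with pd.ExcelWriter(out_path) as writer:
        for client_id, client_df in rule_df.groupby("client_id"):
            # Excel sheet names are limited to 31 characters.
            client_df.to_excel(writer, sheet_name=str(client_id)[:31], index=False)
```

Something along these lines could be scheduled after the monitoring project runs, which keeps you within the 10-plan limit while still producing the 20 per-rule sheets you described.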

 

Your current approach is by no means incorrect, and it appears to be working and solving your need!