Question

Getting values from the previous Ataccama run (monitoring project or profiling run) and comparing them with the current run


I am working on a requirement where any value that was present in column_A of table_1 in the previous profiling/monitoring project run must not be absent from column_A of table_1 in the current (latest) run. If any such values are missing, I have to highlight them. I would appreciate the community's help with this.

Hi ​@Ayush kumar , 

Thank you for your question. 

As Ataccama does not store the actual data permanently, the solution will also have to be configured partly outside of Ataccama.

  1. But first, please confirm my understanding: if a value from column_A of table_1 in the last run is present again in column_A of table_1 in the next run, should you flag it? Or is it the other way around, so that you should flag it if it is NOT present again?
  2. Will you need to compare only the 2 closest runs, or will you need the whole history of runs?

     

Looking forward to your answers! 
Kind regards,
Ekaterina


Thanks for the reply, here are answers to your questions:

1. It is the other way around: if any data from the previous run is not present in the current run, it needs to be flagged in the entity containing the previous data.

2. Yes, I need to compare only the 2 closest runs.



 


Hi ​@Ayush kumar , thank you for the clarifications.

One of the possibilities could look like:

  1. You could create a lookup from column_A of table_1 after the DQ evaluation has run on top of it. Depending on how you plan to run Monitoring Projects, you can rebuild the lookup manually or schedule the rebuild.
  2. Create a rule that compares the data in the next run against the values in the lookup.
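To make the comparison itself concrete, here is a minimal sketch in plain Python, outside of Ataccama. It is not Ataccama rule syntax or an Ataccama API: the function name `find_missing_values` and the sample values are hypothetical, and it assumes the column values are strings and that nulls should be ignored. It returns the values that were in the previous run (the lookup) but are absent from the current run.

```python
def find_missing_values(previous_values, current_values):
    """Return values seen in the previous run that are absent from the current run.

    Assumption: values are strings; None (null) entries are ignored on both sides.
    """
    previous = {v for v in previous_values if v is not None}
    current = {v for v in current_values if v is not None}
    # Set difference: everything in the previous run that the current run lacks.
    return sorted(previous - current)


# Hypothetical contents of column_A in two consecutive runs.
previous_run = ["A", "B", "C", None]
current_run = ["B", "C", "D"]

missing = find_missing_values(previous_run, current_run)
print(missing)  # ['A']
```

Note the direction of the check: the new value "D" is not reported, because only values that disappeared since the previous run are of interest here.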

Please note that this is only one of the possible options. Which one you choose depends heavily on the whole solution design and the motivation behind this requirement.

Please let me know if you have any further questions.


Kind regards,
Ekaterina

 


@ekaterina.ponomareva, thanks for the response. Please let me know if I am thinking about this the right way. According to the approach you mentioned above, the lookup in the ONE web app will contain the previous run's data, and the catalog item will contain the current data. (Let's assume that the catalog item behind the lookup updates daily, and that today the catalog item has been updated but I have not yet rebuilt the lookup.) I then have to compare the catalog item and the lookup using a rule inside the monitoring project.


@Ayush kumar, let’s say you run the Monitoring Project on the 1st of December. After the Monitoring Project has finished, the lookup is updated and contains all the values present in column_A of the Catalog Item (CI) on top of which the Monitoring Project (MP) has run. On the 2nd of December, you run the same Monitoring Project again, but the data in the Catalog Item is different. So the rule compares the data in the CI on the 2nd of December against the data stored in the lookup (the data in the CI as of the 1st of December). After that run, the lookup is rebuilt, and the whole process repeats.
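The ordering in this cycle matters: compare first, refresh the stored values afterwards. The sketch below simulates that order in plain Python with an in-memory stand-in for the lookup. The class `ColumnSnapshot`, its method name, and the sample data are all hypothetical and only illustrate the sequence of steps. They do not represent how Ataccama lookups are built or queried.

```python
class ColumnSnapshot:
    """In-memory stand-in for the lookup that holds the previous run's values."""

    def __init__(self):
        self.values = None  # None means no run has been recorded yet

    def compare_then_refresh(self, current_values):
        """Report values missing since the last run, then store the current values."""
        current = set(current_values)
        if self.values is None:
            missing = []  # first run: nothing to compare against
        else:
            missing = sorted(self.values - current)
        # Refresh only after the comparison, so the next run sees today's data.
        self.values = current
        return missing


snapshot = ColumnSnapshot()
dec_1 = snapshot.compare_then_refresh(["A", "B", "C"])  # first run
dec_2 = snapshot.compare_then_refresh(["B", "C", "D"])  # "A" disappeared
dec_3 = snapshot.compare_then_refresh(["B", "D"])       # "C" disappeared
print(dec_1, dec_2, dec_3)  # [] ['A'] ['C']
```

If the stored values were refreshed before the comparison, every run would compare the data against itself and nothing would ever be flagged.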

Does this make sense?😊

Kind regards,
Ekaterina

