
Hello,

I am using Ataccama V15.4 

I am trying to set the criticality of certain rules in monitoring projects, so that the overall quality of my datasets is defined based on it.

I tried using the score option, but it is not as convincing as setting levels like sev 1, 2, etc.

Thanks.

Hello ​@mp_ataccamauser ,

thank you for your question. If score isn’t what you are looking for, then other solutions I can think of would be something like:

  • Creating an additional field in the DQ rule for the severity. During post-processing you could then download this information from the metadata and use it in your report. If one rule can have multiple severities depending on where you apply it, you would need to create multiple copies of that rule.
  • Using a naming convention instead of an additional field - that is actually something we used with one customer. So they have something like:

COMP_STRING_SEV1 - DQ Rule checking completeness of string field which is considered as severity 1.

VAL_ADDR_POSTAL_CODE - DQ rule checking validity of an address field Postal code which is non-severe.

This way, during post-processing you only need to parse the name of the DQ check, with no need to download additional metadata of the DQ rule.

I don’t know your specific use case, so I am not sure if this is what you are looking for, but at the moment, besides the score, there is no other out-of-the-box mechanism for this.
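As a rough illustration of parsing the severity out of a DQ check name, here is a minimal Python sketch that you could run on exported results outside of Ataccama. It assumes the convention from the two examples above: an optional `_SEV<n>` suffix, with anything lacking the suffix treated as non-severe. The function name `parse_severity` is made up for this example.

```python
import re
from typing import Optional

# Assumed convention: an optional "_SEV<n>" suffix at the end of the
# DQ check name carries the severity; no suffix means non-severe.
SEVERITY_SUFFIX = re.compile(r"_SEV(\d+)$")


def parse_severity(rule_name: str) -> Optional[int]:
    """Return the severity encoded in the name, or None if there is none."""
    match = SEVERITY_SUFFIX.search(rule_name)
    return int(match.group(1)) if match else None


for name in ("COMP_STRING_SEV1", "VAL_ADDR_POSTAL_CODE"):
    print(name, parse_severity(name))
```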

Kind regards,

Anna


@anna.spakova  Actually, we are re-using some of the rules (e.g., checking completeness of certain data types) on multiple catalog items, so a naming convention won’t be suitable here. Could you please elaborate on the additional field option? Where exactly would it be added: in the desktop or on the web, on rules?  Thanks!


Hi @anna.spakova, can we use the Simple Scoring step available in the desktop to control the score assignments based on rule weights/severities? I am not sure if we could use these scores in Ataccama web → monitoring projects. If we can, could you help with how to achieve this? Really looking forward to your input on this. Thanks!


Hi ​@mp_ataccamauser , let me elaborate a bit on the options and answer your questions.

For your first message and the naming convention - in my opinion, even if the DQ check is used on multiple catalog items, its severity can still be the same. If not, you can have several of those DQ checks for different severities. So let’s say on catalog items A, C and D you apply the DQ check COMP_STRING_SEV1, and on catalog item B you apply COMP_STRING_SEV2.

As for adding the additional field, you’d have to add it into the metadata model as a new attribute for rules. E.g. you have name or description fields, so you would add one more, which would be for example “severity”. Metadata model customization is quite a broad topic, so you can refer to this documentation for example: https://docs.ataccama.com/one/latest/metadata-model/metadata-model-tutorial.html Once the field is in, you can extract the severity information during post-processing using the ONE Metadata Reader step. It requires some joins with the output data though, as in the post-processing result you get the dqCheck ID, which is not the same as the ID of the rule it is an instance of. You can refer to this comment that has the details: 
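To show the shape of that join, here is a small Python sketch on plain dictionaries. It is only an illustration of the logic, not how the steps look in a plan: all field names (`record_id`, `dq_check_id`), the IDs, and the two lookup tables are hypothetical stand-ins for the post-processing output and for whatever you read from the metadata.

```python
# Hypothetical post-processing output: one row per failed check.
results = [
    {"record_id": 1, "dq_check_id": "chk-a"},
    {"record_id": 2, "dq_check_id": "chk-b"},
    {"record_id": 3, "dq_check_id": "chk-c"},
]

# Hypothetical metadata extracts: which rule each check instantiates,
# and the custom "severity" attribute stored on each rule.
check_to_rule = {"chk-a": "rule-1", "chk-b": "rule-1", "chk-c": "rule-2"}
rule_severity = {"rule-1": 1, "rule-2": 2}


def attach_severity(rows, check_to_rule, rule_severity):
    """Join check -> rule -> severity; unknown IDs yield None."""
    enriched = []
    for row in rows:
        rule_id = check_to_rule.get(row["dq_check_id"])
        severity = rule_severity.get(rule_id)
        enriched.append({**row, "rule_id": rule_id, "severity": severity})
    return enriched


enriched = attach_severity(results, check_to_rule, rule_severity)
for row in enriched:
    print(row)
```

The two-step lookup mirrors the point above: the check ID first has to be resolved to the rule it is an instance of before the rule's attribute can be read.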

 

As for your other question - you can use the scores in the rule itself and set them differently per severity. Keep in mind, though, that in post-processing the final score is the sum of the scores from all the failed checks per record.

Alternatively, once you parse the failed checks in post-processing, you can also assign the score manually based on the naming convention or severity attribute, using the Simple Scoring step. However, I am not sure if that’s going to give you a better result.
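To make the scoring logic concrete, here is a minimal Python sketch of assigning a score per severity and summing it per record. This is not the Simple Scoring step itself, just the same idea on exported data; the weights, the default for checks without a severity, and the input layout are all assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical weights per severity level; tune these to your own policy.
SEVERITY_WEIGHTS = {1: 100, 2: 50, 3: 10}
DEFAULT_WEIGHT = 1  # assumed score for checks with no severity


def score_records(failed_checks):
    """Sum the severity weights of all failed checks for each record."""
    totals = defaultdict(int)
    for record_id, severity in failed_checks:
        totals[record_id] += SEVERITY_WEIGHTS.get(severity, DEFAULT_WEIGHT)
    return dict(totals)


# (record_id, severity) pairs, one per failed check.
failed = [(1, 1), (1, 3), (2, 2), (3, None)]
scores = score_records(failed)
print(scores)
```

Because the scores are summed, a record failing many low-severity checks can end up with a higher total than a record failing one severe check, so the weights need enough spacing if severity should dominate.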

Kind regards,

Anna