In V15.4, is there any way to add a weight/impact to a rule in the overall data quality score, other than the score option?

Question

Hello,

I am using Ataccama V15.4. I am trying to set the criticality of certain rules in monitoring projects, and the overall quality of my datasets should be defined based on that criticality. I tried using the score option, but it is not as convincing as setting levels like sev 1, sev 2, etc.

Thanks.

anna.spakova · Accepted Answer

Hi @mp_ataccamauser, let me elaborate a bit on the options and answer your questions.

For your first message and the naming convention: in my opinion, even if the DQ check is used on multiple catalog items, its severity can still be the same. If not, you can have several of those DQ checks, one per severity. So let's say that on catalog items A, C and D you apply the DQ check COMP_STRING_SEV1, and on catalog item B you apply COMP_STRING_SEV2.

As for adding the additional field, you'd have to add it to the metadata model as a new attribute for rules. E.g. rules already have name and description fields, so you would add one more, for example "severity". Metadata model customization is quite a broad topic, so you can refer to this documentation, for example: https://docs.ataccama.com/one/latest/metadata-model/metadata-model-tutorial.html

Once the field is in, you can extract the severity information during post-processing using the ONE Metadata Reader step. It requires some joins with the output data though, because in the post-processing result you get the dqCheck ID, which is not the same as the ID of the rule it is an instance of. You can refer to this comment that has the details.

As for your other question: you can use the scores in the rule itself and set them differently per severity. But in post-processing, the final score is the sum of all the scores from all the failed checks per record.

Alternatively, sure, once you parse the failed checks in post-processing, you can also provide the score manually based on the naming convention or the severity attribute, using the simple scoring step. However, I am not sure that is going to give you a better result.

Kind regards,
Anna
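To make the join and the summing described above concrete, here is a minimal sketch in plain Python. It is not Ataccama code and does not use any Ataccama API: the two lookup tables, the ID values, and the severity-to-score mapping are all assumptions chosen for illustration. It resolves each failed dqCheck ID to its rule, reads that rule's severity, and sums the per-check scores for one record.

```python
from typing import Dict, Iterable, Optional

# Hypothetical mapping from severity level to score; pick values that fit your report.
SCORE_BY_SEVERITY: Dict[int, int] = {1: 100, 2: 10}
# Hypothetical score for checks whose rule has no severity set.
DEFAULT_SCORE = 1


def severity_for_check(
    check_id: str,
    rule_id_by_check: Dict[str, str],
    severity_by_rule: Dict[str, int],
) -> Optional[int]:
    """Join a dqCheck ID to its rule ID, then look up the rule's severity."""
    rule_id = rule_id_by_check.get(check_id)
    if rule_id is None:
        return None
    return severity_by_rule.get(rule_id)


def record_score(
    failed_check_ids: Iterable[str],
    rule_id_by_check: Dict[str, str],
    severity_by_rule: Dict[str, int],
) -> int:
    """Sum the scores of all failed checks for a single record."""
    total = 0
    for check_id in failed_check_ids:
        severity = severity_for_check(check_id, rule_id_by_check, severity_by_rule)
        total += SCORE_BY_SEVERITY.get(severity, DEFAULT_SCORE)
    return total


if __name__ == "__main__":
    # Made-up IDs: which rule each check instance belongs to, and each rule's severity.
    rule_id_by_check = {"chk-1": "rule-A", "chk-2": "rule-B"}
    severity_by_rule = {"rule-A": 1, "rule-B": 2}
    print(record_score(["chk-1", "chk-2"], rule_id_by_check, severity_by_rule))
```

Because the scores are summed, a large gap between the severity 1 and severity 2 values keeps one critical failure from being outweighed by a handful of minor ones.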

anna.spakova · Answer

Hello @mp_ataccamauser,

thank you for your question. If score isn't what you are looking for, then other solutions I can think of would be something like this:

Creating an additional field in the DQ rule for the severity. During post-processing you could then download this information from the metadata and use it in your report. Or, in case one rule can have multiple severities depending on where you apply it, you would need to create multiple copies of the rule.

Also, instead of an additional field you could use a naming convention. That is actually something we used with one customer. So they have something like:

COMP_STRING_SEV1: DQ rule checking completeness of a string field, considered severity 1.
VAL_ADDR_POSTAL_CODE: DQ rule checking validity of the address field Postal code, considered non-severe.

This way, during post-processing you only need to parse the name of the DQ check; there is no need to download additional metadata of the DQ rule.

I don't know your specific use case, so I am not sure if this is what you are looking for, but at the moment, besides the score, there is no other out-of-the-box mechanism for this.

Kind regards,
Anna
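The name parsing mentioned above can be sketched as follows. This is plain Python rather than an Ataccama expression, and it assumes the convention is exactly a trailing `_SEV<number>` suffix, with names lacking the suffix treated as non-severe; the function name is made up for illustration.

```python
import re
from typing import Optional

# Assumed convention: severity is encoded as a trailing "_SEV<number>" suffix.
SEVERITY_SUFFIX = re.compile(r"_SEV(\d+)$")


def parse_severity(check_name: str) -> Optional[int]:
    """Return the severity encoded in a DQ check name, or None if there is none."""
    match = SEVERITY_SUFFIX.search(check_name.strip())
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    for name in ("COMP_STRING_SEV1", "COMP_STRING_SEV2", "VAL_ADDR_POSTAL_CODE"):
        print(name, parse_severity(name))
```

Anchoring the pattern at the end of the name avoids misreading a rule whose name merely contains "SEV" somewhere in the middle.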
