
Hi

I would like to ask whether we can run a fuzzy check in Ataccama across two different datasets, say Set 1 and Set 2, that come from different data sources. The two datasets have different numbers of columns, but the fuzzy check involves only customer name and address. Neither dataset has a unique identifier. Is this achievable in Ataccama, and what steps would produce the results of the fuzzy check? Thanks.

Hello @Radziah,

To build a DQ check on top of two datasets, you first have to join them into a single dataset, because DQ rules do not currently support cross-table checks. Since you mentioned that the data come from different sources, you can use a Virtual Catalog Item to do this.

https://docs.ataccama.com/one-desktop/15.4.0/work-with-ataccama-one/virtual-catalog-items.html
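
To illustrate the general idea of joining two datasets that share no unique identifier, here is a minimal sketch in plain Python. It is not Ataccama functionality: it uses the standard library's `difflib.SequenceMatcher`, and the function names (`normalize`, `similarity`, `fuzzy_join`), the column names, the 0.85 threshold, and the sample records are all assumptions made up for this example.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def fuzzy_join(set1, set2, threshold=0.85):
    """Pair records from two datasets by name and address similarity.

    Only the 'name' and 'address' fields are compared; any other columns
    are ignored, so the datasets may have different layouts.
    The threshold is an arbitrary illustrative value.
    """
    matches = []
    for i, left in enumerate(set1):
        for j, right in enumerate(set2):
            name_score = similarity(left["name"], right["name"])
            addr_score = similarity(left["address"], right["address"])
            score = (name_score + addr_score) / 2  # simple unweighted average
            if score >= threshold:
                matches.append((i, j, score))
    return matches


# Hypothetical sample data: Set 1 has an extra column that Set 2 lacks.
set1 = [
    {"name": "John Smith", "address": "12 High Street, London", "segment": "retail"},
]
set2 = [
    {"name": "Jon Smith", "address": "12 High St, London"},
    {"name": "Mary Jones", "address": "5 Park Lane, Leeds"},
]

matches = fuzzy_join(set1, set2)
for i, j, score in matches:
    print(f"Set 1 row {i} ~ Set 2 row {j} (score {score:.2f})")
```

This compares every pair of rows, which is only practical for small datasets. For larger volumes you would normally restrict comparisons to candidate groups first, for example rows that share a postcode.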

Is this what you were asking about, or are you also interested in the fuzzy DQ checks themselves? If so, can you provide more details on what the fuzzy DQ check should do?

Thank you.

Kind regards,

Anna

