Duplicate Entity Identification

(Inna Burmistrova) #1

Is it possible to run a duplicate entity analysis across multiple attributes using algorithms such as _Soundex_ or _Metaphone_?


Hi Inna,

Identifying and processing duplicates are just two of the many features available in Ataccama’s commercial tool, DQC.

In DQA, there are functions to transform data using Soundex, Metaphone or Double Metaphone.

You can use these functions in the Expression field of the Text File Writer step, the Column Assigner step, or the Profiling step of a plan.
In addition, you can add the result of the transformation as another column for analysis.
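
To illustrate the general idea outside the tool, here is a minimal sketch in plain Python (not DQA expression syntax). It assumes a standard American Soundex encoding, computes one code per attribute, and treats records that share the same combination of codes as duplicate candidates. The names `soundex`, `find_duplicate_candidates`, and the sample attributes `first_name` and `last_name` are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Soundex digit for each consonant; vowels, Y, H and W have no digit.
_CODES = {}
for _digit, _letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                         "4": "L", "5": "MN", "6": "R"}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(value):
    """Return the 4-character American Soundex code of a string."""
    letters = [c for c in value.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    encoded = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same digit
        code = _CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        prev = code  # a vowel resets prev, so a repeated digit is kept
    return (encoded + "000")[:4]


def find_duplicate_candidates(records, attributes):
    """Group records whose Soundex codes match on every listed attribute."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(soundex(record[name]) for name in attributes)
        groups[key].append(record)
    return [group for group in groups.values() if len(group) > 1]


records = [
    {"first_name": "Robert", "last_name": "Smith"},
    {"first_name": "Rupert", "last_name": "Smyth"},
    {"first_name": "Inna", "last_name": "Jones"},
]
candidates = find_duplicate_candidates(records, ["first_name", "last_name"])
for group in candidates:
    print(group)
```

With this sample data, the first two records land in the same group because both attributes produce identical codes. Matching on a combined key like this is deliberately strict; a real deduplication run would usually add further comparison rules on top of it.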

If you are interested in our commercial tools, request a PoC. We will be pleased to show you how easy it is to identify and process duplicate entities using our tools.