Solved

How to validate the scenario in description?


Userlevel 3
Badge +4

 "When one column have countries, another column have city" This city should be in that country else it is invalid.

 

Please help.

icon

Best answer by Maxim Kim 23 November 2022, 06:39

View original

13 replies

Userlevel 3
Badge +6

Could you clarify the context? Is it about an arbitrary dataset/table where columns with the countires and cities are unknown and there should be a way to:

  1. identify country column;
  2. identify city column;
  3. for each city in city column check if it exists in the corresponding country that is in country column?

 

col1 col2 col3 ... check
Edinburgh 3452301 United Kingdom   True(Edinburgh is in UK)
Tegusigalpa 4485903 Mexico   False(Tegusigalpa is not in Mexico)

 

Userlevel 3
Badge +4
Location   City ... check
Edinburgh   United Kingdom   True(Edinburgh is in UK)
Tegusigalpa   Mexico   False(Tegusigalpa is not in Mexico)

 

Userlevel 3
Badge +4

I am able to detect the country column but not able to validate the above condition.

Userlevel 3
Badge +6

I am able to detect the country column but not able to validate the above condition.

If country and city columns are known then the condition should rely on additional lookup file (should be prepared in advance):

cities.lkp (example)

city country
Melbourne Australia
Syndey Australia
Kanberra Australia
London United Kingdom
Glasgow United Kingdom
Liverpool United Kingdom
... ...

 

Using this lookup it should be possible to detect in what country city is and compare it to the value of country column

I am not sure if there is a function in expression language one can use in simple validation rules of Ataccama One Web interface, but it should be doable in a component validation rule (using Ataccama IDE)

Userlevel 3
Badge +4

Please provide me the rule.

 

 

Userlevel 3
Badge +6

Please provide me the rule.

 

Unfortunately I don’t have any environment (if you need the rule for the Ataccama One web) where I can try to implement proof of concept of the component validation rule. I believe component based validation rules are covered by one of existing trainings ataccama has.

If we speak about general approach, using DQC (Ataccama IDE) you’ll have to do following:

Build lookup file for the mapping of city to country

 

Excel file reader is for 

Lookup builder

 

Test example data in another excel

 

countries_dataset reads excel file with “data”:

 

Check the data with lookup we’ve built in prev step:

 

Getting city_country from the lookup (mapping of the city to the country):

 

With the column assigner we have to validate if country we get from dataset is equal to the actual country we get from lookup mapping:

 

And the result is in text file:

 

 

Userlevel 3
Badge +4

Thank you.  I have created lookup file like the below: 

I have used the below rule:

It did not work.

Then I created 2 look up files:

 

 

THen Rule:

It did not work.

Userlevel 3
Badge +4

Please suggest something in Web interface.

Userlevel 3
Badge +6

Actually what you can do is to have something like this:

 

 

The steps are:

  1. build a lookup with cities and countries joined with “_”:
  2. upload to Ataccama One
  3. Create a validation rule with 2 input parameters
  4. Select Advanced Expression as a rule logic and check if concatenated CITY + “_” + COUNTRY is in the lookup

 

 

PS

It would be nice to build a lookup with all Uppercased values and use upper cased city and country in the rule not isInFile(upper(CITY + "_" + COUNTRY), "lookup:city_country").

Userlevel 3
Badge +4

Thanks a lot but while creating look up file it is removing ‘_’.

 

CSV:

 

LKP:

 

 

Userlevel 3
Badge +4

I am able to create look up file 

 

 

But the above rule is not working

 

 

Userlevel 3
Badge +4

Its working thanks a lot

Userlevel 3
Badge +6

Its working thanks a lot

This is great!

Reply