Solved

NULL Valued fields on DQS

  • 11 March 2023
  • 6 replies
  • 79 views

Userlevel 2
Badge +2

How are NULL’s considered? I have a case statement which gives the right value matching measure -used debug to test. But when it comes to matching rules, it skips the rule. 

Example:

Name      NPI

Ramona  12345678

NULL       12345678

the 2nd row should match the 1st on NPI value and considering on of them is null.

Even the scenario

Name      NPI

NULL       12345678

NULL       12345678

these two should match together based on NPI. I do not see a special character. If I replace NULL with a value ‘A’ , they match with rule Matching on NPI. but does not work for NULL. 

 

Any thoughts or suggestions appreciated. 

icon

Best answer by AKislyakov 17 March 2023, 00:59

View original

6 replies

Userlevel 4
Badge +4

Hi sgilla,

That should not be the case. I tested it specifically on Extended Unification step version 12.6.3 and null values work for me as expected. Can you share your expression for the measure?

For the test I created a new measure testMeasure which should match on null values:


And a new rule TestRule, which rely solely on the measure:


And as expected null values are matched to everything, including the null values.

 

Userlevel 2
Badge +2

AKislyakov thank you for trying a test. 

our matching measure is similar

SC_NAME   

 //Something is null
    center.mat_name is null
    or
    slave.mat_name is null,
    1,

and 

SC_NPI if match then 2

So Matching rule is looking for SC_NAME is 1 and SC_NPI is 2

Would having multiple group unions cause issue? As the groups look like

  1. Union 1: all records except where Name not like ‘ABC INC’
  2. Union 2: Only records with NAME like ‘ABC INC’  

Reason for these groups: They have thousands of similar records and do not want them to be processed along with the main group to improve performance. Every time a similar record ‘ABC INC’ comes in all the similar records are being considered in matching and matched through this separate group. 

The records I have trouble with do not fall in this separate group, thought I mention. 

Userlevel 4
Badge +4

It shouldn’t be a problem. But are the null values falling to either of these groups?

Depending on how exactly the groupping expressions are written it might be that null values are just lost.

 

E.g., if you are using something like that for group separation:

Group 1

not matches(".*ABC INC.*", mat_name)

Group 2

matches(".*ABC INC.*", mat_name)

then suddenly mat_name=null will produce the null result and will be ignored by both groups. 

 

Userlevel 2
Badge +2

You are right. The nulls are not falling in any of the groups but if I exclude all groups nulls are fine, if I add groups even not related single group, nulls get separated.

And if I don't have groups for these large sets the performance is bad. 

Userlevel 4
Badge +4

It is hard to give any meaningful performance advice without knowing all of the details and structure of the data.

Still here is one idea: Extended Unification Step allows you to have multiple unify operations, so you can have a pre-matching step that will use sc_npi as a key and match your null-records to other ones. And then you can use result of this matching to properly select the group.

Userlevel 2
Badge +2

That is a good idea. I will try it out.

Alternative which worked for me is using a temporary mat_name and defaulting Null’s to an alphabet. And this is discarded it later. 

Reply