
Dear Colleagues,

 

I am looking for a way to loop (iterate) over data inside an expression.

What I need to achieve: match a 4-value string against another 4-value string and get a Levenshtein match score.

The method is to match each of the 4 values in one string against each of the 4 values in the other string, which adds up to 16 combinations.

I want to create a loop along these lines:
String1 := L1+' '+L2+' '+L3+' '+L4;
String2 := MATCH_L1+' '+MATCH_L2+' '+MATCH_L3+' '+MATCH_L4;
MS := null; //this will be the match string: a dynamically extending string, as shown below

for i := 0 to 3
    MV1 := set.subSequence(String1,i,1); //match value is L1
    for i2 := 0 to 3
        MV2 := set.subSequence(String2,i2,1); //match value is MATCH_L1
        MC := levenshtein(MV1,MV2); //we have some score here as the match case
    next i2;
    MS := MS+MC+'_MC'+toString(i)+';'; //the match score and case number are stored here; new values are appended to the string on each pass
next i;

MS //here I use the final match string value as it is, and that's all for now.

So the rough idea for the iteration is as shown above, but this syntax is not compatible with the expression coding used in Ataccama component steps. Obviously, the 'for' construct won't work as written.

Is there any similar solution I can use to iterate through the elements of String1 and String2?

Using SQL is not an option.

Thank you in advance for any idea.

 

 

 

 

Hello!
I roughly understand what you mean, but just to be sure: should the MC variable really be used outside the inner loop, given that it is overwritten on each inner iteration?
Unfortunately, the expression language does not support loops. However, this particular case, or a similar one, can be solved in a somewhat unconventional way using the set.mapExp function, something like this:

set.mapExp(string1, ' ', (x) {
    set.mapExp(string2, ' ', (y) {
        levenshtein(x, y, false) + '_MC'
    })
})

or, when the iteration index is needed, with a trick built around the namedSequence function:

$i0 := namedSequence('temp_seq_i') + 1;
set.mapExp(string1, ' ', (x) {
    $i := namedSequence('temp_seq_i') - $i0;
    set.mapExp(string2, ' ', (y) {
        levenshtein(x, y, false) + '_MC' + $i
    })
})

The mapExp function splits the string into elements, passes each element to the “lambda function”, and then concatenates the results again. In this case, two nested calls handle the required each-to-each evaluation of levenshtein.
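
For readers who want to check the expected result outside the tool, here is a minimal Python sketch of the same split, map and rejoin idea. It is not Ataccama code: map_exp is a hypothetical stand-in that assumes set.mapExp rejoins the results with the same separator, levenshtein is a plain case-sensitive edit distance, and match_string is an illustrative name. The index comes from enumerate instead of a sequence.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance (case-sensitive by assumption)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def map_exp(values: str, sep: str, fn) -> str:
    """Hypothetical stand-in for set.mapExp: split, apply fn, rejoin with sep."""
    return sep.join(str(fn(x)) for x in values.split(sep))


def match_string(string1: str, string2: str) -> str:
    """Score every element of string1 against every element of string2."""
    parts = []
    for i, x in enumerate(string1.split(" ")):
        # Inner pass: one score per element of string2, tagged with the outer index.
        parts.append(map_exp(string2, " ", lambda y: f"{levenshtein(x, y)}_MC{i}"))
    return " ".join(parts)


if __name__ == "__main__":
    print(match_string("ab cd", "ab ce"))
```

With four values on each side, this yields 16 space-separated scores, one for each combination, which can serve as a reference when validating the output of the nested mapExp expression.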

