Comparison between 2 columns that do not match exactly 100%

Hello group,

I am working with KNIME, and in my dataset I need to compare the references stored in 2 different columns.

Here is my problem, and why I am asking for your help…

As an example, column A has the reference AAA11122222BBB-333CCC and column B, which I want to compare it to, has 0AA11121222BBBB333CC.

As you can see, one or more characters have changed, but both strings match at about 80%, so I would like a case like this to be reported as a “match”.

Can someone help me build the workflow?

Thanks in advance

Hi
There is a String Similarity node. You might want to check it out. There should be some examples on the KNIME Hub as well.
br


Hello @26AngelG,

To achieve this result, you can use either the String Similarity node or the Java Snippet node.

I used the Java Snippet node (workflow attached: Comparation between 2 Column.knwf (73.8 KB)). In the Java Snippet node, you can write logic like this:

// c_column1 and c_column2 are the input column fields of the Java Snippet node
String valueA = c_column1;
String valueB = c_column2;

// dp[i][j] = edit distance between the first i chars of valueA and the first j chars of valueB
int[][] dp = new int[valueA.length() + 1][valueB.length() + 1];
for (int i = 0; i <= valueA.length(); i++) {
    dp[i][0] = i;
}
for (int j = 0; j <= valueB.length(); j++) {
    dp[0][j] = j;
}
for (int i = 1; i <= valueA.length(); i++) {
    for (int j = 1; j <= valueB.length(); j++) {
        // substitution costs 0 when the characters are equal, 1 otherwise
        int cost = (valueA.charAt(i - 1) == valueB.charAt(j - 1)) ? 0 : 1;
        dp[i][j] = Math.min(Math.min(dp[i - 1][j] + 1, dp[i][j - 1] + 1), dp[i - 1][j - 1] + cost);
    }
}
int levenshteinDistance = dp[valueA.length()][valueB.length()];

// minimum similarity (80%) required to report a match
double threshold = 0.8;

// normalize the distance by the length of the longer string
double maxLen = Math.max(valueA.length(), valueB.length());
double similarityScore = 1.0 - (double) levenshteinDistance / maxLen;

String matchStatus = (similarityScore >= threshold) ? "Match" : "No Match";

// out_new is the output column field of the Java Snippet node
out_new = matchStatus;
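If you want to try the logic above outside KNIME first, here is a minimal standalone sketch of the same Levenshtein-based similarity. The class name `FuzzyMatchDemo` and its method names are made up for illustration and are not part of KNIME or of the attached workflow. Treating two empty strings as identical is my own assumption, added to avoid a 0/0 division.

```java
// Hypothetical standalone class for trying the logic outside KNIME.
class FuzzyMatchDemo {

    // Dynamic-programming Levenshtein distance:
    // insertion, deletion and substitution each cost 1.
    static int levenshtein(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) dp[i][0] = i;
        for (int j = 0; j <= b.length(); j++) dp[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                dp[i][j] = Math.min(Math.min(dp[i - 1][j] + 1, dp[i][j - 1] + 1),
                                    dp[i - 1][j - 1] + cost);
            }
        }
        return dp[a.length()][b.length()];
    }

    // Similarity in [0, 1]: 1 - distance / length of the longer string.
    // Assumption: two empty strings count as identical.
    static double similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) return 1.0;
        return 1.0 - (double) levenshtein(a, b) / maxLen;
    }

    // "Match" when the similarity reaches the threshold, "No Match" otherwise.
    static String matchStatus(String a, String b, double threshold) {
        return similarity(a, b) >= threshold ? "Match" : "No Match";
    }

    public static void main(String[] args) {
        String a = "AAA11122222BBB-333CCC";
        String b = "0AA11121222BBBB333CC";
        System.out.println("distance   = " + levenshtein(a, b));
        System.out.println("similarity = " + similarity(a, b));
        System.out.println("status     = " + matchStatus(a, b, 0.8));
    }
}
```

For the two references from the question, the distance is 4 (three substitutions and one deletion) and the longer string has 21 characters, so the similarity is 1 - 4/21, roughly 0.81. That only just clears the 0.8 threshold, so you may want to tune the threshold on your real data.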



Thanks @Daniel_Weikert!! I will check it out.

UFFF, great help @tqAkshay95!! You helped me a lot… I will try it!! THANKS


Sorry @Daniel_Weikert, but I can't find the “String Similarity” node in KNIME. I searched the internet and saw that it is necessary to install NodePit… I don't know how to do that.

Could you help me, please? Thanks again.

It needs to be added to the available update sites.
There is an older blog post which might be helpful.

br


OK!! I checked it, but I don't have the latest KNIME version. Is it possible that NodePit only runs on the latest version? Thanks

Which version of KNIME are you using, @26AngelG?

Sorry, I didn't see this post @takbb!! :frowning:

I have this version, but I cannot install NodePit (or I don't know how to).

Take a look at this recent post.


@26AngelG ,

If you have found a solution, please mark it with the green tick.
