Compare and match 2 columns - ADD columns with mismatch result

Hello, KNIME users!

I am new to KNIME and was looking for a way to find similar matches between two columns in two different files. I found the topic below:

It worked very well for what I needed, but I would like to go a bit further.
I need to extract some more information using the query created by Mr. @aworker, and this is where I am stuck and need your help.

1-) I need one additional column that tells me which word was not found when comparing one file to the other. For example:
File 1 has a row with the value: Honda Civic 1.5 Gasoline
The similarity process found a row in File 2 with the value: Honda Civic 1.5 Diesel
It has a distance of 0.25
I need a column saying that the word it did not find was Gasoline

2-) It would be good if I could get more than a single result from the match. For example, I want to see every result where the similarity distance is between 0 and 0.3, not only the most similar match, and if nothing falls within this range it should show no result at all.
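
To make item 1 concrete, here is a minimal sketch in plain Python of the kind of word-level comparison I have in mind. It is only an illustration: the function name `missing_words` is made up for this example, it splits on whitespace and ignores case, and it is not part of the original query. Something like it could probably be adapted to a scripting node, but I have not tried that.

```python
def missing_words(source, target):
    """Return the words of `source` that do not occur in `target`.

    Assumption: words are separated by whitespace and compared
    case-insensitively. The original word order is preserved.
    """
    target_words = {word.lower() for word in target.split()}
    return [word for word in source.split() if word.lower() not in target_words]


# Values taken from the example above.
file1_value = "Honda Civic 1.5 Gasoline"
file2_value = "Honda Civic 1.5 Diesel"

# Word from File 1 that was not found in the File 2 match.
print(missing_words(file1_value, file2_value))
# Word from the File 2 match that was not found in File 1.
print(missing_words(file2_value, file1_value))
```

Joining the returned list with a space would give the text for the extra column.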

Is it possible to add either of these two items to the original query? And if so, how could I do that?

Thank you very much!

Just a quick update: it seems that changing the neighbor count would solve my second problem of displaying more than one result, so now I just don’t know how to solve the first issue.
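
For reference, this is roughly the behavior I expect for the second item, written as a small plain-Python sketch. The distance here is a stand-in I made up for illustration (the share of words the two values do not have in common); it is not necessarily the measure the similarity search in the workflow uses. The names `token_distance` and `matches_within`, and the Toyota row, are hypothetical.

```python
def token_distance(a, b):
    """Stand-in distance: 0.0 when both values share all words, 1.0 when none.

    Assumption: whitespace-separated words, compared case-insensitively.
    """
    words_a = {word.lower() for word in a.split()}
    words_b = {word.lower() for word in b.split()}
    if not words_a and not words_b:
        return 0.0
    return 1.0 - len(words_a & words_b) / max(len(words_a), len(words_b))


def matches_within(query, candidates, max_distance=0.3):
    """Return every (candidate, distance) pair with distance <= max_distance.

    The result is sorted from most to least similar and is an empty list
    when no candidate falls within the range.
    """
    scored = [(candidate, token_distance(query, candidate)) for candidate in candidates]
    hits = [(candidate, dist) for candidate, dist in scored if dist <= max_distance]
    return sorted(hits, key=lambda pair: pair[1])


file2_values = [
    "Honda Civic 1.5 Diesel",
    "Honda Civic 1.5 Gasoline",
    "Toyota Corolla 2.0 Gasoline",  # hypothetical row, not from the attached files
]

print(matches_within("Honda Civic 1.5 Gasoline", file2_values))
print(matches_within("Ford Ka 1.0", file2_values))  # nothing within range
```

The point is the filter on the distance column: every match at or below the threshold is kept, and a row with no match in range produces nothing.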

Hi @guilhermelima01 -

Maybe you could post your workflow in progress with a bit of sample data, so folks could more easily assist?

Hello @ScottF, I am going to provide my current workflow and some data:

File_1.xlsx (9.0 KB)
File_2.xlsx (10.0 KB)

First, these are the two files I am using as an example. As I mentioned before, I need to run the similarity check between them and find the correct ID from one file in the other. However, there will be situations where File 2 has more data and some rows are quite similar. For example, the vehicle Gol has two different versions, the III and the V, and everything else is equal, so I need both in my final result. I think this part is solved.

What I also need is a column where I can see what differed between File 1 and File 2. Using the Gol as an example again, this new column should show that what was not found in File 1 was the III and the V from File 2.
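
As a rough sketch of the output I am after, the plain-Python snippet below builds one row per match and fills the extra column with the words from File 2 that are missing in File 1. The values "Gol 1.0", "Gol 1.0 III" and "Gol 1.0 V" are simplified placeholders, not the exact contents of the attached files, and the names `words_not_in` and `mismatch_rows` are made up for this example.

```python
def words_not_in(reference, other):
    """Return the words of `other` that do not occur in `reference`.

    Assumption: whitespace-separated words, compared case-insensitively.
    """
    reference_words = {word.lower() for word in reference.split()}
    return [word for word in other.split() if word.lower() not in reference_words]


def mismatch_rows(file1_value, file2_matches):
    """Build one output row per File 2 match, with the mismatch column filled in."""
    rows = []
    for match in file2_matches:
        rows.append({
            "file1": file1_value,
            "file2": match,
            "not_in_file1": " ".join(words_not_in(file1_value, match)),
        })
    return rows


# Placeholder values, simplified from the Gol example described above.
for row in mismatch_rows("Gol 1.0", ["Gol 1.0 III", "Gol 1.0 V"]):
    print(row)
```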

Below you can see how the workflow looks right now:

Similarity_check_test.knwf (91.4 KB)

Thank you again for all the support!
