Hello, I am trying to compare the values in the same column from two different Excel files and find the differences between them.
I find that the String Similarity node doesn’t exist anymore, and the String Matcher node doesn’t do the job as expected.
Could you please help me with a node or an example workflow?
The result should show that apple matches, and that ball and cat have dissimilarities.
Thank you in advance!
Welcome to the KNIME Forum. If you are looking for the String Similarity node, you have to install the Palladian for KNIME extension. See the information and instructions on NodePit.
Thanks very much, Hans. I will download it.
@HansS I see that the String Similarity node gives matching strings a value of 1. How can I find the dissimilar ones and record them?
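For anyone who wants to see the idea outside KNIME: if a similarity of 1 means the strings are identical, dissimilarity can be taken as 1 minus the similarity, and the rows to record are those with a similarity below 1. Here is a minimal Python sketch of that idea. It is not the Palladian node's implementation: it uses `difflib.SequenceMatcher` from the standard library as a stand-in metric, and the sample pairs are hypothetical, since the actual data is not shown in this thread.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score in [0, 1]; 1.0 means the two strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical sample pairs (value from file 1, value from file 2).
pairs = [("apple", "apple"), ("ball", "bal"), ("cat", "cot")]

rows = []
for left, right in pairs:
    sim = similarity(left, right)
    rows.append(
        {
            "left": left,
            "right": right,
            "similarity": sim,
            "dissimilarity": 1.0 - sim,  # 0.0 for identical strings
        }
    )

# Keep only the pairs that are not a perfect match.
mismatches = [row for row in rows if row["similarity"] < 1.0]

for row in mismatches:
    print(row["left"], row["right"], round(row["dissimilarity"], 3))
```

In KNIME terms this would roughly correspond to computing the similarity column, deriving a second column as 1 minus that value, and filtering on rows where the similarity is below 1.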
There is also the Similarity Search node which you may find useful, as well as a few example workflows on the KNIME Community Hub associated with ‘Fuzzy Matching’ that may give you some ideas:
Are they on the same row, or how exactly is the comparison made? Maybe you can simply join the data and see what matches and what does not.
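To illustrate the join idea, here is a small Python sketch of a full outer join on exact values, which labels each value as present in both sources or only in one of them. The sample lists and the function name `outer_join_status` are hypothetical, and this only catches exact matches, not near matches.

```python
def outer_join_status(left_values, right_values):
    """Label every distinct value as 'both', 'left_only' or 'right_only'."""
    left_set, right_set = set(left_values), set(right_values)
    status = {}
    for value in sorted(left_set | right_set):
        if value in left_set and value in right_set:
            status[value] = "both"
        elif value in left_set:
            status[value] = "left_only"
        else:
            status[value] = "right_only"
    return status


# Hypothetical column values from the two Excel files.
left = ["apple", "ball", "cat"]
right = ["apple", "bal", "cot"]

status = outer_join_status(left, right)
for value, label in status.items():
    print(value, label)
```

Values labelled `left_only` or `right_only` are the ones that did not find an exact partner in the other file.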
@Shrinidhi if this is indeed about cells being identical (or not), why not use equals?
If it is about similarity I have this collection:
Thanks so much @mlauber71
The two strings come from two different sources, and the same string may differ between the sources because of issues like extra spaces or special characters @Daniel_Weikert
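When the differences are mostly spaces and special characters, one option is to normalize both columns before comparing them, so that only real differences remain. A minimal Python sketch of such a cleanup step follows. The function name `normalize` and the sample strings are hypothetical, and ignoring upper/lower case is an extra assumption that was not mentioned above; drop the `casefold()` call if case matters.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Reduce a string to lowercase words separated by single spaces."""
    # Unify equivalent Unicode forms (for example full-width characters).
    text = unicodedata.normalize("NFKC", text)
    # Assumption: case differences should be ignored.
    text = text.casefold()
    # Replace every run of non-alphanumeric characters (including
    # whitespace and underscores) with a single space.
    text = re.sub(r"[\W_]+", " ", text)
    return text.strip()


# Hypothetical values for the same item from two sources.
source_a = "  Ball-Point_Pen! "
source_b = "ball point pen"

same_after_cleanup = normalize(source_a) == normalize(source_b)
print(same_after_cleanup)
```

After this step, an exact comparison or join only flags values that still differ once spacing, punctuation and case are out of the way.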
@ScottF I will work more on this and tell you if it worked.
Is the challenge that these columns need to be joined by the similarity match? If so, then you may consider taking a multi-step approach using some advanced joiner components. I have to do this relatively often for forensic accounting projects trying to match up relative details between manually entered fields from multiple systems.
Often a single approach will not work for the entire matching process. If you also need to match values in other fields to ensure proper join logic, you can use the component below. I often process columns for keywords and then join based on a multi-column match and a “contains” approach in the fuzzy match columns.
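The multi-step idea can be sketched in Python as a series of passes, each working only on what the previous pass left unmatched: exact match first, then a “contains” match, then a fuzzy match above a threshold. This is an illustration under assumptions, not the component mentioned above: the function name `multi_step_match`, the threshold value, the sample data, and the use of `difflib.SequenceMatcher` as the fuzzy metric are all hypothetical choices.

```python
from difflib import SequenceMatcher


def multi_step_match(left_values, right_values, threshold=0.8):
    """Match values in passes: exact, then contains, then fuzzy."""
    unmatched_left = list(left_values)
    unmatched_right = list(right_values)
    matches = []

    def run_pass(label, pick):
        # Iterate over a copy because matched items are removed as we go.
        for left in list(unmatched_left):
            right = pick(left, unmatched_right)
            if right is not None:
                matches.append((left, right, label))
                unmatched_left.remove(left)
                unmatched_right.remove(right)

    def exact(left, candidates):
        return left if left in candidates else None

    def contains(left, candidates):
        # Skip empty strings, which would be "contained" in everything.
        return next(
            (c for c in candidates if c and left and (left in c or c in left)),
            None,
        )

    def fuzzy(left, candidates):
        def score(c):
            return SequenceMatcher(None, left, c).ratio()

        best = max(candidates, key=score, default=None)
        if best is not None and score(best) >= threshold:
            return best
        return None

    run_pass("exact", exact)
    run_pass("contains", contains)
    run_pass("fuzzy", fuzzy)
    # Whatever is still left over on the left side found no partner.
    matches.extend((left, None, "unmatched") for left in unmatched_left)
    return matches


# Hypothetical values from two systems.
left = ["apple", "ball", "cat", "dog"]
right = ["apple", "ball pen", "cot", "zebra"]

result = multi_step_match(left, right, threshold=0.6)
for row in result:
    print(row)
```

Running the strict passes first keeps the looser ones from claiming values that have a clean exact partner, and the label on each row records which step produced the match so the weaker ones can be reviewed by hand.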