Hope you are doing well.
I just need your help comparing certain words so that I can progress in my workflow.
Requirement:
Input:
Name of Vendor:
Qwerty Private Limited
Qwerty Private Ltd.
Qwerty Pvt. Ltd.
All three names refer to a single vendor, but KNIME shows them as 3 unique names. What steps should I use, or what changes should I make, so that KNIME treats all 3 as the same?
Hi @ravi13
The String Similarity node comes with the Palladian nodes. Additional extensions become available once you enable the update site in KNIME: go to File -> Preferences -> Install/Update -> Available Update Sites and add http://download.nodepit.com/palladian/4.0 there. The extension can then be installed by selecting File -> Install KNIME Extensions (source: https://www.knime.com/community ).
gr. Hans
That is a good question. I think you need a human in the loop. You have to find out how low a similarity percentage you can accept and still be sure (to some level) that it is the same company. It may be a good first step to “clean” your data and enrich it with some business logic, translated into rules (as @armingrudd suggested).
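To illustrate the "clean first" idea, here is a minimal sketch in Python rather than in KNIME nodes. The abbreviation table and the function name `normalize_vendor` are assumptions made up for this example; the real rules would have to come from your own data. In KNIME, similar rules could probably be built with string manipulation or rule-based nodes.

```python
import re

# Hypothetical abbreviation table (an assumption for this example);
# extend it with the legal forms that actually occur in your data.
ABBREVIATIONS = {
    "pvt": "private",
    "ltd": "limited",
}

def normalize_vendor(name: str) -> str:
    """Lower-case the name, drop punctuation, and expand known abbreviations."""
    # Replace everything except letters, digits and spaces with a space.
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)

names = ["Qwerty Private Limited", "Qwerty Private Ltd.", "Qwerty Pvt. Ltd."]
normalized = {normalize_vendor(n) for n in names}
print(normalized)  # {'qwerty private limited'}
```

After this step the three spellings collapse into one value, so an exact comparison is enough for this case and a similarity threshold is only needed for names the rules do not cover.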
gr. Hans
Still, how would involving a human help here?
All the names belong to the same vendor, but one is 72.2% similar and another only 41.2%. How can this be solved? Also, this is only an example for 1 vendor; there are many more. In that situation we would need a different range for each vendor, which would become a bit tedious.
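One possible way around per-vendor ranges is to compare only the distinctive part of each name. The sketch below is plain Python, not a KNIME workflow, and it rests on assumptions: the `LEGAL_WORDS` list, the helper names `core_name` and `similarity`, and the extra vendor "Asdf Pvt. Ltd." are all hypothetical. It uses `difflib.SequenceMatcher` from the standard library, which is not the same measure as the String Similarity node, so the percentages will differ from the ones above.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Assumed list of legal-form words that carry no vendor identity.
LEGAL_WORDS = {"private", "pvt", "limited", "ltd"}

def core_name(name: str) -> str:
    """Return the name without punctuation and without legal-form words."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_WORDS)

def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0.0 and 1.0 from the standard library."""
    return SequenceMatcher(None, a, b).ratio()

# "Asdf Pvt. Ltd." is a made-up second vendor, added only for illustration.
vendors = [
    "Qwerty Private Limited",
    "Qwerty Private Ltd.",
    "Qwerty Pvt. Ltd.",
    "Asdf Pvt. Ltd.",
]

# Group all spellings that share the same core name.
groups = defaultdict(list)
for vendor in vendors:
    groups[core_name(vendor)].append(vendor)

for key, members in groups.items():
    print(key, "->", members)

# On the raw strings the score is below 1.0; on the core names it is exactly 1.0.
raw_score = similarity("Qwerty Private Limited", "Qwerty Pvt. Ltd.")
core_score = similarity(core_name("Qwerty Private Limited"), core_name("Qwerty Pvt. Ltd."))
print(raw_score < 1.0, core_score)
```

With the legal-form words removed, the same rule applies to every vendor, so a single threshold (or an exact match on the core name) may be enough. Names with typos in the distinctive part would still need a similarity check and probably a manual review.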
I didn’t look at the details, but in such cases the Index Query node seems to be the better option. I have used it a couple of times and was really satisfied with the outcome. I can try to create an example and compare the workflows and results.