Bringing up Fuzzy Lookup/Matching/Joining again.
I’ve done extensive work with Fuzzy Match in Excel, and it is frustratingly close to impossible to reproduce the same results efficiently in KNIME.
String Matcher problems:
- Only provides a score for the first match, not for subsequent matches.
- Scores are reported as a distance, not a similarity.
- Only a single field can be matched. If I want to fuzzy-match company names, street addresses, city, state, postal code, and country, each field should incur its own penalties independently. Currently, I have to ram all these values into a single field.
- It would be great to have the ability to weight all of the match criteria and/or have similarity thresholds on each of the joins.
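
To make the request concrete, here is a rough Python sketch of the behaviour described in the list above. It is not how String Matcher or any other KNIME node works; it only illustrates per-field similarity scores, weights, per-field thresholds, and a score for every returned candidate. The field names, weights, thresholds, and the function names `similarity`, `score_pair`, and `fuzzy_join` are all made up for illustration, and the standard library's `difflib.SequenceMatcher` ratio stands in for whatever similarity measure a real node would use.

```python
from difflib import SequenceMatcher

# Hypothetical field weights and per-field minimum similarities (illustrative values only).
WEIGHTS = {"company": 0.5, "street": 0.3, "city": 0.1, "postal_code": 0.1}
THRESHOLDS = {"company": 0.6, "street": 0.5, "city": 0.5, "postal_code": 0.5}


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical after trimming and lowercasing."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def score_pair(left: dict, right: dict):
    """Weighted similarity across all fields, or None if any field misses its threshold."""
    total = 0.0
    for field, weight in WEIGHTS.items():
        s = similarity(left[field], right[field])
        if s < THRESHOLDS[field]:
            return None  # this field alone is enough to reject the pair
        total += weight * s
    return total / sum(WEIGHTS.values())


def fuzzy_join(left_rows, right_rows, top_n=3):
    """For every left row, return up to top_n (right_index, score) pairs, best first."""
    results = []
    for i, left in enumerate(left_rows):
        scored = []
        for j, right in enumerate(right_rows):
            s = score_pair(left, right)
            if s is not None:
                scored.append((j, s))
        scored.sort(key=lambda pair: pair[1], reverse=True)
        results.append((i, scored[:top_n]))
    return results
```

Every candidate comes back with its own score, not just the best one, and the score is a similarity, so higher is better. Comparing every left row with every right row is fine for a sketch but would need blocking or indexing on real tables.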