Hey there,
without seeing your data it is a bit difficult to give concrete advice.
I happen to recall that there was a Just KNIME It challenge on a very similar topic. Here is the link to the “official” solution:
You can find the original challenge here (scroll down to Challenge 10):
More community solutions are linked in this thread:
https://forum.knime.com/t/solutions-to-just-knime-it-challenge-10-season-3/81162/26
And my own, very minimal, solution is here:
I think if you check out some of these solutions, you should get a good feel for how String Similarity can be used to address your problem.
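
In case it helps to see the general idea outside of KNIME, here is a minimal Python sketch of similarity-based matching. It is only an illustration and not what the challenge solutions do internally: it uses `SequenceMatcher` from Python's standard `difflib` module, and the function names, the sample strings and the `threshold` of 0.6 are my own assumptions that you would have to adapt to your data.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(query, candidates, threshold=0.6):
    """Return (candidate, score) for the most similar candidate.

    Returns None if there are no candidates or if the best score is
    below the (assumed) threshold.
    """
    scored = [(c, similarity(query, c)) for c in candidates]
    best = max(scored, key=lambda pair: pair[1], default=None)
    if best is None or best[1] < threshold:
        return None
    return best


if __name__ == "__main__":
    # Hypothetical sample values, just to show the call.
    print(best_match("Jon Smith", ["John Smith", "Jane Doe"]))
```

The threshold is the part that usually needs tuning: too low and unrelated strings get matched, too high and small typos are no longer caught.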