@robvp I once created this workflow that groups addresses without a ground truth to match against. Maybe you can adapt it.
If you apply it, the result looks something like this:
You can edit the threshold that determines what counts as similar (Similarity Search – KNIME Community Hub) and maybe also configure the method.
If you set the threshold to 0.33 (instead of 0.25) the result would be this:
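For anyone who wants to get a feel for the threshold outside of KNIME, here is a minimal Python sketch of the general idea. It is not what the Similarity Search node does internally: the distance (1 minus the `difflib.SequenceMatcher` ratio), the greedy grouping, and the names `distance` and `group_strings` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def distance(a: str, b: str) -> float:
    """Return 1 - similarity ratio, so 0.0 means identical (ignoring case)."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_strings(values, threshold=0.25):
    """Greedy grouping (illustrative assumption, not the KNIME algorithm).

    Each string joins the first group whose representative (its first
    member) is within `threshold`; otherwise it starts a new group.
    """
    groups = []
    for value in values:
        for group in groups:
            if distance(value, group[0]) <= threshold:
                group.append(value)
                break
        else:
            # No existing group was close enough.
            groups.append([value])
    return groups


# Made-up sample addresses, only to show how the function is called.
addresses = [
    "12 Main Street, Springfield",
    "12 Main St, Springfield",
    "Main Street 12 Springfield",
    "7 Oak Avenue, Shelbyville",
]

for threshold in (0.25, 0.33):
    print(threshold, group_strings(addresses, threshold))
```

A larger threshold accepts more distant strings as matches, so groups tend to merge. Since the grouping is greedy, the result can also depend on the order of the input rows.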
What you could also try is changing the order of the words so that similar words end up in different positions.
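One possible way to experiment with word order, sketched in Python rather than KNIME: sort the words of each string alphabetically before comparing, so that the same words always land in the same positions. This is only one reading of the idea, and `normalize_word_order` is a hypothetical helper, not part of the workflow.

```python
def normalize_word_order(text: str) -> str:
    """Lowercase the text, split on whitespace, and sort the words alphabetically."""
    return " ".join(sorted(text.lower().split()))


# Two spellings with the same words in a different order
# become identical after normalization.
print(normalize_word_order("Street Main 12"))
print(normalize_word_order("12 main street"))
```

The normalized strings could then be fed into the similarity comparison instead of the raw ones. Punctuation is not handled here, so "Street," and "Street" would still count as different words.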
String Deduplication without Ground Truth - KNIME Forum (75366).knwf (192.9 KB)