CSV and Similarity Search

I can link a Table to Similarity Search and execute it, but if I link two CSV files (nodes) to Similarity Search it fails: I can't see the fields in the inclusion list.


Take a look at



Thank you very much for your reply. I tried it, but it compares string to string within the same row. I have one column of 500K names and another column of 150K names, and I am trying to fuzzy match them against each other.

So each cell probably needs to be fuzzy matched against the entire second column. Some advice would be much appreciated :)
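To make the requirement concrete: every name in the first column is scored against all names in the second column, and only the closest candidate is kept. Below is a minimal Python sketch of that idea, outside KNIME, using the standard library's `difflib.get_close_matches`. The helper `best_match`, the sample names, and the cutoff of 0.6 are assumptions for illustration only.

```python
from difflib import get_close_matches


def best_match(name, candidates, cutoff=0.6):
    """Return the closest candidate for `name`, or None if nothing reaches `cutoff`."""
    hits = get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return hits[0] if hits else None


# Hypothetical sample data standing in for the two name columns.
column_a = ["Jon Smith", "Xyz"]
column_b = ["John Smith", "Mario Garcia", "Peter Chan"]

# One-to-many: each name in column_a is compared with all of column_b.
best = {name: best_match(name, column_b) for name in column_a}
for name, match in best.items():
    print(f"{name!r} -> {match!r}")
```

Names with no sufficiently similar counterpart map to `None`, which makes it easy to review unmatched rows separately.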


@Toronto825 I have compiled a list of resources about address deduplication. Maybe you want to check that out and see if it can help you.

Also, if you have a node like String similarity, at the bottom of the node's page there is a list of sample workflows that use the node, so you can see how other people have used it. That can often help you get an idea of how it works.

Like for example:


The steps here will be:

  1. Cross join the 2 CSV files
  2. Use String similarity
  3. Filter on similarity
  4. Convert the nodes from steps 1-3 to a component
  5. Stream the component (streaming may need to be installed)
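The steps above can be sketched in plain Python to show why streaming matters: a full cross join of 500K by 150K names is 75 billion pairs, so the pairs should be scored and filtered one at a time rather than stored. This is only an analogy to the workflow, not KNIME code. It uses `difflib.SequenceMatcher` as a stand-in similarity measure, and the function names, the sample data, and the 0.8 threshold are assumptions.

```python
from difflib import SequenceMatcher
from itertools import product


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def fuzzy_pairs(left, right, threshold=0.8):
    """Yield (left_name, right_name, score) for pairs at or above `threshold`.

    product() generates the cross join lazily, so each pair is scored and
    filtered on the fly instead of materializing the full joined table.
    """
    for a, b in product(left, right):
        score = similarity(a, b)
        if score >= threshold:
            yield a, b, score


# Hypothetical sample data standing in for the two CSV columns.
names_a = ["Jon Smith", "Maria Garcia"]
names_b = ["John Smith", "Mario Garcia", "Peter Chan"]

matches = list(fuzzy_pairs(names_a, names_b, threshold=0.8))
for a, b, score in matches:
    print(f"{a!r} ~ {b!r}: {score:.2f}")
```

If the matches look "way off", raising the threshold or normalizing the names further before scoring are the first things to try in a sketch like this.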

I am not getting good results; the names are way off.

Do I need to use the counter generation node? Do you have a workflow draft that could help?

Please look at this example.

