I am working on a list of companies and I need to check whether there are duplicate companies in a table.
But it’s not easy to find those duplicates because some companies are written in different ways.
And the list is very long, so I need to optimize the process of finding duplicates.
I have tried to use the workflows for duplicate addresses, but they are not working very well.
So, could someone help me build a workflow to find the duplicates?
Just as an example, here are some company names to show you what the data looks like.
A-2-Z Solutions
A&B
A&G
a2i
A2Z Solutions
ABSolute
Absolute Magic
AC Exchange
AC&E
ACS
ACS Shop
Active
Active Data
C3 Development
C3I
EDP
EDS
Harvey
HarveyOpolis
I had exactly the same requirement and, after trying various options, found the best solution (although it took me a while to work out how to configure it to fit my data structure) to be this example workflow:
Adapting this workflow helped me to identify a bunch of duplicate organisation records, albeit with some false positives, that would have been hard and time-consuming to identify manually. If you combine this workflow with the xls formatting nodes, you can export the results into a nicely formatted Excel sheet that groups potential duplicates into colour-coded blocks. Unfortunately I can’t share my workflow as it contains confidential data, but here’s an edited and non-confidential extract of the output.
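
For anyone who wants to prototype the general idea outside a workflow first, here is a minimal Python sketch of fuzzy name matching. It is not the workflow mentioned above, just an illustration of the same principle: normalise each name, score every pair for similarity, and keep the pairs above a cut-off as candidates for manual review. The function names `normalize` and `find_candidate_pairs` and the 0.85 threshold are my own assumptions, and the sample names are taken from the list in the question.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

def normalize(name):
    """Lowercase the name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())

def find_candidate_pairs(names, threshold=0.85):
    """Return (name_a, name_b, score) for pairs whose normalised forms are similar.

    The threshold is an assumed starting point and needs tuning on real data:
    lower values catch more duplicates but produce more false positives.
    """
    pairs = []
    for a, b in combinations(names, 2):
        score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if score >= threshold:
            pairs.append((a, b, score))
    # Most similar pairs first, so the likeliest duplicates are reviewed first
    return sorted(pairs, key=lambda p: p[2], reverse=True)

# Sample names taken from the question
names = [
    "A-2-Z Solutions", "A&B", "A&G", "a2i", "A2Z Solutions",
    "ABSolute", "Absolute Magic", "AC Exchange", "AC&E", "ACS",
    "ACS Shop", "Active", "Active Data", "C3 Development", "C3I",
    "EDP", "EDS", "Harvey", "HarveyOpolis",
]

for a, b, score in find_candidate_pairs(names):
    print(f"{score:.2f}  {a}  <->  {b}")
```

With this normalisation, "A-2-Z Solutions" and "A2Z Solutions" reduce to the same string and are flagged as a candidate pair. Note that the sketch compares every name with every other name, so on a very long list you would probably want to group names first (for example by first letter) and only compare within each group.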