Ideas for solution to join multi-column names

BSLsteve · January 12, 2026, 3:00pm

Hello,

I wonder if anyone can help me with a workflow solution. I have two sets of data that I need to join by customer names, but in one set of data the names are not consistent.

The first set has first name and last name, one in each column. The second set has the full name in the first column, starting with the last name… sometimes with commas and sometimes without.

Some of the time people will use middle names, some of the time the main contact will appear under contact 2 instead of contact 1, while other times two contacts will be shown under contact 1.

There are of course duplicates of common names (e.g. two different ‘Smiths’ and two different ‘Johns’).

I split cells on set 2 by space after using the clean string node to remove punctuation and special characters, but that presented new problems

I’ve added some dummy data below to show what this looks like, of course not all of the data will necessarily connect - so I need to do a full join showing both outers, too. But much of it will connect if I knew how.

Any help with what to do here, would be very much appreciated.

Set1:

Column 1	Column 2	Column 3	Column 4
First Name	Last Name	First Name2	Last Name 2
John	Smith
Dave	Smith
John	Jones
Robert	Brown	Jenny	Brown
James	King
Bertrand	Russell	Sarah	Russell
Deborah	Franklin	Benjamin	Franklin

Set2:

Column 1	Column 2	Column 3	Column 4
Contact 1	Contact 2
Smith, John
Smith, Dave
Jones John
Brown Campbell Robert
Foley-King, James
Russell Bertrand / Sarah
Franklin Benjamin

Set2 (split):

Column 1	Column 2	Column 3	Column 4
Contact 1	Contact1a	Contact1b	Contact1c
Smith	John
Smith	Dave
Jones	John
Brown Campbell Robert	Campbell	Robert
King	Foley	James
Russell Bertrand / Sarah	Bertrand	Bertrand	Sarah
Franklin	Benjamin

fe145f9fb2a1f6b · January 12, 2026, 3:14pm

option 1) go back to the data owner and tell them to get their ■■■■ sorted.

option 2) concat row splitters and trim down your data set to have finally a single rule per case covered. seeing those examples, i would first make everything uppercase and trim leading and trailing whitespaces because the dataset sucks and likely has more hidden surprises.

then you can start a rule based approach and nest it, e.g. splitting those that have only input in column 1, then split those that have a comma in the upper path, and likely split those that have input in the first two columns for the other path.

in the end, you will likely still have data that needs interpretation to figure out which is the first and which is the surname. A dict can help here but good luck figuring out if someones name is Paul Michael or Michael Paul

anilsknp · January 12, 2026, 3:29pm

Hi @BSLsteve below are hints/solution in the simple programing methodology not in Kinme’s.

1st step - Remove duplicate rows from the 1st set on the “First Name” and “Last Name” Basis.

2nd step - Then Break 2nd set column1 in to “First Name” and “Last Name” column and clean both the columns with the help of whichever node you like ( mainly “String Manipulation”).

3rd step - Join both the Data Sets on the “First Name” and “Last Name” both column basis. Keep in mind you must join both the data set using both columns.

BSLsteve · January 12, 2026, 4:31pm

Thanks, on step 2 - how do I get around that some have middle names? When I tried this, it split them into three columns - sometimes column 2 had a first name, but sometimes column 2 would be a middle name and column 3 had the first name

anilsknp · January 12, 2026, 5:03pm

Hi @BSLsteve use “Cell Splitter Node” and specify Delimiter space or comma.

rfeigel · January 13, 2026, 2:43am

Try using a Column Aggegator node to count the number of columns in each row. Use an Expression node to rejoin either the 1st and 2nd Columns or 1st and 3rd depending on whether there are 2 or 3 columns. Then resplit the result.

BSLsteve · January 13, 2026, 10:26am

@rfeigel Thank you, this sounds like what I’m in need of. Would you be so kind as to elaborate a little?

rfeigel · January 14, 2026, 4:16pm

Try this workflow. It cleans and organizes the names from both sets and joins them where possible. Its pretty hardwired to the data you provided and may not work if the data format changes.

rfeigel · January 16, 2026, 3:46pm

@BSLsteve Have you looked at the workflow I posted? Thoughts?

BSLsteve · January 16, 2026, 3:57pm

Ah, sorry, I did not see this - I will get it now, thank you

BSLsteve · January 19, 2026, 9:46am

It works perfectly, thank you very much @rfeigel

I have to say, this is such a kind, helpful community. I have asked a few questions on here, being a novice and all - and each time it has been met with solutions where people have gone above and beyond to help. It says a lot, I think.

rfeigel · January 19, 2026, 3:35pm

You’re welcome. Glad I could help.

system · January 26, 2026, 3:36pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.