Find duplicate value rows from a set of columns

PaulCombal · July 8, 2021, 12:29pm

Hi,

I am struggling with some logic that I can’t implement in KNIME.

I have a list of customers in a table, which has 3 columns for different phone numbers as strings. Let’s say 1 row = 1 customer.

I suspect that some rows share a phone number, though not necessarily in the same column. I would like to find out which rows have a phone number in common.

In this example:

I would like to extract rows 1 and 2, since both of them contain the value “phone4”.

How would you do this in KNIME?

Thank you

bruno29a · July 8, 2021, 1:14pm

Hi @PaulCombal , I think this should do the trick:

Input data (same as yours):

I added a “CustomerID” column using the RowID node, just so we can identify the rows in the results. This is optional, and I assume you would have some CustomerID column of your own:

And here are the results:

We get rows 1 and 2, as expected.

I prefer presenting the results horizontally (matching records side by side on one line) instead of vertically (one record per line, one after the other), since you can see at a glance which record matches which. With a vertical layout, you would have to work out which lines belong to the same set of matches.

Of course, if you prefer vertical results, there are several ways to extend the workflow to produce them.

Here’s the workflow: Find duplicate value rows from a set of columns.knwf (8.3 KB)
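For anyone who can't open the workflow, here is a rough sketch of the same idea outside KNIME, in plain Python. It is only an illustration of the logic as I understand it (collect the phone values from all three columns, then match customers on the shared value), not a transcription of the attached workflow. The column names `phone_a`, `phone_b`, `phone_c`, the helper `find_shared_phones`, and all sample values other than the shared "phone4" are made up for the example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample data: the original table is not reproduced here,
# so every value except the shared "phone4" is invented for illustration.
customers = [
    {"CustomerID": 1, "phone_a": "phone1", "phone_b": "phone4", "phone_c": "phone2"},
    {"CustomerID": 2, "phone_a": "phone4", "phone_b": "phone5", "phone_c": None},
    {"CustomerID": 3, "phone_a": "phone6", "phone_b": "phone7", "phone_c": "phone8"},
]
PHONE_COLUMNS = ["phone_a", "phone_b", "phone_c"]


def find_shared_phones(rows, phone_columns, id_column="CustomerID"):
    """Return (id_left, id_right, phone) for every pair of rows sharing a phone."""
    # Step 1: collect, for each phone value, the customers that have it,
    # regardless of which column the value appears in.
    owners = defaultdict(set)
    for row in rows:
        for col in phone_columns:
            phone = row.get(col)
            if phone:  # skip missing or empty values
                owners[phone].add(row[id_column])

    # Step 2: pair up the customers that share a phone value.
    # Each pair is reported once, with the smaller ID on the left.
    matches = []
    for phone, ids in owners.items():
        for left, right in combinations(sorted(ids), 2):
            matches.append((left, right, phone))
    return sorted(matches)


matches = find_shared_phones(customers, PHONE_COLUMNS)
for left, right, phone in matches:
    print(f"Customer {left} and customer {right} share {phone}")
```

Each tuple holds both matching customers on one line, which corresponds to the horizontal layout described above. Using a set per phone value means a number repeated within a single row does not count as a match with itself.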

PaulCombal · July 8, 2021, 2:27pm

Very clever use of the joiner… Thank you so much for your help, I couldn’t have thought of this myself!
