KNIME duplicate row filter cannot remove all duplicate data

Hi Everyone,
I have an issue. When I try to remove duplicate string data using the Duplicate Row Filter node, it does not remove all of the duplicates. After exporting the data to an Excel file (using the Excel Sheet Appender node), I can remove the remaining duplicates with Excel's built-in “Remove Duplicates” function. How can I resolve this issue?

Regards,
Ekram

Hi @emshihab , I have used this node many times and have never encountered such an issue.

Can you please share your data or sample data and how you configured the node? Or perhaps share your workflow?

Could you also show the expected results for that data?

Due to data security concerns, I cannot share the data right now, but I will create some sample data and share it with you.

image

I think I found the issue. Excel 365's “Remove Duplicates” ignores letter case, so it treats values that differ only in case as duplicates. KNIME compares strings case-sensitively and considers both rows unique. Can you suggest how to resolve this?

I would convert them all to the same case using a String Manipulation node and then do the duplicate row removal.

Hi @emshihab , convert them to lower case, and then remove the duplicates.

If you want to keep the original data as is, you can write the lowercase values into a new column, apply the Duplicate Row Filter to that new column, and then remove the new column afterwards.
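
For anyone who wants to see the logic outside of KNIME, here is a minimal Python sketch of the same idea: compare on a lowercased key, but keep the original values untouched. The function name `dedupe_case_insensitive` and the sample rows are made up for illustration, and keeping the first occurrence is an assumption about how the filter is configured.

```python
def dedupe_case_insensitive(rows, column):
    """Keep the first row for each value of `column`, ignoring letter case."""
    seen = set()
    kept = []
    for row in rows:
        # Temporary comparison key; the row itself is never modified,
        # which mirrors the "helper column that is removed afterwards" idea.
        key = row[column].lower()
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical sample data, not the original poster's data.
rows = [
    {"Part number": "abc-123"},
    {"Part number": "ABC-123"},
    {"Part number": "Xyz-9"},
]

result = dedupe_case_insensitive(rows, "Part number")
print(result)
```

Only the first spelling of each value survives, and it keeps its original case.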

thanks for the solution

Also keep in mind that the “famous” whitespace characters can be a pain as well if you forget to remove them.
br

That’s a very good point @Daniel_Weikert .

@emshihab , you can use strip() to get rid of leading and trailing whitespace:
lowerCase(strip($Part number$))
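
As a rough equivalent outside of KNIME, the same normalization can be sketched in Python. The helper name `normalize` and the sample values are hypothetical, and Python's `str.strip()` may not treat exactly the same set of characters as whitespace as the KNIME function does, so take this as an analogy rather than an exact match.

```python
def normalize(value):
    """Trim leading/trailing whitespace, then lowercase."""
    return value.strip().lower()


# Hypothetical part numbers that differ only in case or outer whitespace.
values = ["AB-100", "ab-100 ", " Ab-100", "CD-200"]

# dict.fromkeys keeps the first occurrence of each key and preserves order.
unique = list(dict.fromkeys(normalize(v) for v in values))
print(unique)  # ['ab-100', 'cd-200']
```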

In parallel, you can also check for whitespace by adding some marker text before and after your values:
join("XXX", $Part number$, "XXX")

image

As you can see, the last 2 records have a whitespace character at the end.
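
The same marker trick can be sketched in Python if you want to check values outside of KNIME. The names `mark` and `has_outer_whitespace` and the sample values are made up for illustration.

```python
def mark(value, marker="XXX"):
    """Wrap a value in marker text so leading/trailing whitespace becomes visible."""
    return f"{marker}{value}{marker}"


def has_outer_whitespace(value):
    """True if stripping would change the value."""
    return value != value.strip()


# Hypothetical values: the second one has a trailing space.
values = ["AB-100", "AB-100 "]
for v in values:
    print(mark(v), has_outer_whitespace(v))
```

A gap between the value and the closing marker points to trailing whitespace.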

After removing duplicates:
image

I put something together for you. Here’s what the workflow looks like:
image

Here’s the workflow: Remove duplicate different case.knwf (9.7 KB)

thank you very much for the solution.
