De-duplication of records

How do I de-duplicate records in KNIME (specified by multiple keys)? In addition, I require output options such as:

  • output the duplicate records
  • output only the unique records (one copy of each record that has duplicates)
  • output the records which appear only once (records which have no duplicates)

 

Use the GroupBy node.

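Add all of your key columns as group columns, and add a count aggregation. Afterwards you can filter the resulting table on the count: a count of 1 marks a record that appears only once, and a count greater than 1 marks a duplicated key.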

Best, Iris

How do you add a count?

In the Options tab, add any of your remaining columns with the aggregation method Count.

(If you have no column left, you can create one beforehand, e.g. with the Constant Value Column node.)
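
For anyone who wants to see the logic outside of KNIME, here is a minimal Python sketch of the same group-and-count idea. It is not KNIME code and not how the GroupBy node is implemented; the function name split_by_key_count and the sample columns (first, last, city) are made up for illustration. It counts how often each key combination occurs and then splits the rows on that count, which corresponds to the three outputs asked for in the question.

```python
from collections import Counter


def split_by_key_count(rows, keys):
    """Count rows per key combination, then split the rows on that count.

    Hypothetical helper for illustration only; `rows` is a list of dicts
    and `keys` is the list of column names that define a duplicate.
    """
    def key_of(row):
        return tuple(row[k] for k in keys)

    # Equivalent of grouping on the key columns with a Count aggregation.
    counts = Counter(key_of(row) for row in rows)

    # Rows whose key occurs more than once (all duplicate records).
    duplicates = [r for r in rows if counts[key_of(r)] > 1]

    # Rows whose key occurs exactly once.
    singles = [r for r in rows if counts[key_of(r)] == 1]

    # One representative (the first seen) per duplicated key.
    seen = set()
    first_of_duplicates = []
    for r in duplicates:
        k = key_of(r)
        if k not in seen:
            seen.add(k)
            first_of_duplicates.append(r)

    return duplicates, first_of_duplicates, singles


# Made-up sample data: the key is (first, last).
rows = [
    {"first": "Ann", "last": "Lee", "city": "Oslo"},
    {"first": "Ann", "last": "Lee", "city": "Bergen"},
    {"first": "Bob", "last": "Ray", "city": "Oslo"},
]
duplicates, first_of_duplicates, singles = split_by_key_count(rows, ["first", "last"])
```

Here the two "Ann Lee" rows end up in duplicates, the first of them in first_of_duplicates, and the "Bob Ray" row in singles. In the workflow itself, the same split would come from filtering the GroupBy output on the count column.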