Detect and delete duplicate rows based on the latest date variable, not on row ID

Mokrani · May 24, 2018, 9:26am

Dear all,
I’m looking for a way to deal with duplicate rows by deleting the duplicate that has the older date.

The output needs to look like this:

Best regards.

gab1one · May 24, 2018, 10:04am

Hi @Mokrani
You can do this with the GroupBy node: add the columns that stay the same to the Group Column(s) table, and use the aggregation method Maximum on the update date column.
best,
Gabriel
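
For readers who find code easier to follow than node settings, here is a minimal plain-Python sketch of the same idea: group on the columns that stay the same and keep the maximum date per group. It is not KNIME code; the sample rows, the column layout (name, city, update date) and the function name `latest_per_group` are assumptions made up for illustration.

```python
from datetime import date

# Hypothetical sample rows: (name, city, update_date).
rows = [
    ("Alice", "Paris", date(2018, 1, 5)),
    ("Alice", "Paris", date(2018, 3, 2)),
    ("Bob", "Lyon", date(2017, 11, 20)),
]


def latest_per_group(rows):
    """Group on every column except the date and keep the maximum date."""
    latest = {}
    for *key, updated in rows:
        key = tuple(key)  # the unchanged columns act as the group key
        if key not in latest or updated > latest[key]:
            latest[key] = updated
    return [(*key, updated) for key, updated in latest.items()]


deduplicated = latest_per_group(rows)
```

With these sample rows, the two "Alice" rows collapse into one row carrying the newer date, and the "Bob" row is kept as it is.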

Mokrani · May 24, 2018, 10:09am

Could you send me an example so I can understand better? I’m quite new to KNIME.
Thanks

gab1one · May 24, 2018, 11:42am

Here you go:

GroupBy demo.knwf (6.0 KB)

best,
Gabriel

Mokrani · May 24, 2018, 12:47pm

Thank you!!
But I have a problem with rows that contain different links, for example:

The output needs to look like this:

Could you help with that?

Mokrani · May 24, 2018, 1:29pm

I found a solution: I added an aggregation that takes the last URL (I noticed that KNIME can sort the date strings automatically).

Thanks again.
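
The "last URL" approach can be sketched in plain Python as follows: sort the rows by date, then let later rows overwrite earlier ones within each group, so the URL that survives belongs to the newest row. This is only an illustration under assumed data, not KNIME code; the rows, the column layout (name, URL, update date) and the function name `latest_row_per_name` are hypothetical.

```python
from datetime import date

# Hypothetical sample rows: (name, url, update_date).
rows = [
    ("Alice", "http://example.com/old", date(2018, 1, 5)),
    ("Bob", "http://example.com/bob", date(2017, 11, 20)),
    ("Alice", "http://example.com/new", date(2018, 3, 2)),
]


def latest_row_per_name(rows):
    """Keep, for each name, the whole row that has the newest date."""
    latest = {}
    # Sort by date so the last row seen for each name is the newest one.
    for name, url, updated in sorted(rows, key=lambda row: row[2]):
        latest[name] = (name, url, updated)  # later rows overwrite earlier ones
    return list(latest.values())


result = latest_row_per_name(rows)
```

Note that this relies on the sort order being chronological. If the dates are stored as strings, an alphabetical sort matches the chronological order only for formats such as YYYY-MM-DD, so it may be safer to convert them to a real date type first.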
