How to select the row with the most recent date (unique_id has multiple occurrences)

Hi All,

I have the below file:
[image: screenshot of the table]

As you can see in the screenshot, the same unique_id appears twice because the rmad field has been updated. I need logic that tells KNIME to keep, for each unique_id, only the row with the most recent rmad date. In this case I would keep only the row with RMAD 2021-06-15.

Would anyone know how to do this?

Kind regards,

Rutger

You could use the GroupBy node, grouping on the unique ID and aggregating RMAD with Maximum. Converting RMAD with the String to Date&Time node before the grouping may help.
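For readers who think in code, the same idea can be sketched in pandas. This is only an illustration of the logic, not the KNIME node itself: the sample rows, the `status` column, and every date except 2021-06-15 are made up for the example.

```python
import pandas as pd

# Hypothetical sample data; only the 2021-06-15 date comes from the thread.
df = pd.DataFrame({
    "unique_id": ["A1", "A1", "B2"],
    "rmad": ["2021-05-01", "2021-06-15", "2021-04-20"],
    "status": ["old", "new", "only"],  # assumed extra column
})

# Equivalent of converting the string column to a date type first.
df["rmad"] = pd.to_datetime(df["rmad"])

# Group on the ID and take the maximum date per group.
latest = df.groupby("unique_id", as_index=False)["rmad"].max()
print(latest)
```

Note that the result contains only the grouping column and the aggregated column; any other column has to be added to the aggregation explicitly.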
br


Hi @Daniel_Weikert,

Thanks for the reply!

I converted RMAD to Date&Time. However, when I run the GroupBy, the table structure changes and the output only holds the aggregated columns I included in the configuration. I need the table format to stay the same, keeping only the row with the max date for each unique_id.

Kind regards,

Rutger

I can obviously add all the fields in the group settings… Thanks for the help, @Daniel_Weikert.

The GroupBy node certainly works, but the Duplicate Row Filter node also provides this functionality.

If you ever present the workflow, non-technical users will probably find the Duplicate Row Filter node easier to follow than the GroupBy node.
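As a rough analogy in pandas, assuming the filter treats unique_id as the duplicate key and keeps the row with the maximum rmad, the logic looks like the sketch below. The sample rows, the `status` column, and every date except 2021-06-15 are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the 2021-06-15 date comes from the thread.
df = pd.DataFrame({
    "unique_id": ["A1", "A1", "B2"],
    "rmad": ["2021-05-01", "2021-06-15", "2021-04-20"],
    "status": ["old", "new", "only"],  # assumed extra column
})
df["rmad"] = pd.to_datetime(df["rmad"])

# Sort by date, keep the last (most recent) row per unique_id,
# then restore the original row order.
latest_rows = (
    df.sort_values("rmad")
      .drop_duplicates(subset="unique_id", keep="last")
      .sort_index()
)
print(latest_rows)
```

Unlike a plain group-and-maximum, this keeps every column of the surviving rows, so the table structure stays the same.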


Hello @rutgerverhaar,

As @Snowy pointed out, this is exactly the use case for the Duplicate Row Filter node :wink:

Br,
Ivan

