filter for maximum date per a given field

Hello everyone,

I have the following data:

Sprintid issueid version_id sprintenddate releasedate
50 10 51147 9/11/2021 15/12/2021
50 11 51147 9/11/2021 15/12/2021
50 12 51147 9/11/2021 15/12/2021
51 13 51147 10/12/2021 15/12/2021
51 13 51147 10/12/2021 15/12/2021
51 14 51147 10/12/2021 15/12/2021

And I want to filter this data for the maximum sprintenddate per issueid.

The desired output is the following:

Sprintid issueid version_id sprintenddate releasedate
50 10 51147 9/11/2021 15/12/2021
50 11 51147 9/11/2021 15/12/2021
50 12 51147 9/11/2021 15/12/2021
51 13 51147 10/12/2021 15/12/2021
51 14 51147 10/12/2021 15/12/2021

Can anyone help me on how to achieve this?

Thank you a lot :slight_smile:

Hi @AdrianaFerro

Maybe the -Duplicate Row Filter- node could do this job. This sounds like a job for it :slight_smile:

Hope this helps.

Best

Ael

Hi, it looks like that way I’m only removing the duplicates, but I really need to filter for the maximum sprintenddate per issueid.

The example is just a very simple dataset.

Thank you :wink:

If you first sort by date in descending order, the row with the maximum sprintenddate per issueid comes first among the rows that count as duplicates. Then you can filter duplicates based on the key columns (here issueid) that define which rows belong together.

The -Duplicate Row Filter- should keep the first row of each group of duplicates, so after the sort the one row that remains per issueid is the one with the latest date.
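For readers who want to see the sort-then-deduplicate logic outside of KNIME, here is a minimal sketch in plain Python. It is not what the node does internally, just the same idea written out. The helper names `parse_date` and `latest_per_issue_sorted` are made up for this illustration, and the day/month/year format is an assumption based on the sample data.

```python
from datetime import datetime

# Sample rows from the question:
# (sprintid, issueid, version_id, sprintenddate, releasedate)
rows = [
    (50, 10, 51147, "9/11/2021", "15/12/2021"),
    (50, 11, 51147, "9/11/2021", "15/12/2021"),
    (50, 12, 51147, "9/11/2021", "15/12/2021"),
    (51, 13, 51147, "10/12/2021", "15/12/2021"),
    (51, 13, 51147, "10/12/2021", "15/12/2021"),
    (51, 14, 51147, "10/12/2021", "15/12/2021"),
]


def parse_date(text):
    # Assumption: dates are day/month/year, as the sample data suggests.
    return datetime.strptime(text, "%d/%m/%Y").date()


def latest_per_issue_sorted(rows):
    # Step 1: sort so that the latest sprintenddate comes first.
    ordered = sorted(rows, key=lambda r: parse_date(r[3]), reverse=True)
    # Step 2: keep only the first row seen for each issueid.
    seen = set()
    kept = []
    for row in ordered:
        if row[1] not in seen:
            seen.add(row[1])
            kept.append(row)
    # Put the result back in issueid order for readability.
    return sorted(kept, key=lambda r: r[1])


result = latest_per_issue_sorted(rows)
for row in result:
    print(row)
```

Note that the sort has to run on real dates. If the column is still text, "9/11/2021" sorts after "10/12/2021" and the wrong row can end up first.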

Hope this trick helps :wink:

Best

Ael

Thank you so much! It worked :wink:

My pleasure and glad it worked :wink: !

Best

Ael

Hello @AdrianaFerro,

If your date column is of type Local Date in KNIME, you don’t need to sort before the Duplicate Row Filter node, because you can tell the node to keep the row with the maximum of that column.

[Screenshot: DateMaximum]
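As a rough sketch of the idea (not the node's actual implementation), keeping the row with the maximum date per key without sorting first could look like this in plain Python. The function name `keep_max_per_key` is hypothetical, and the dates are stored as `date` objects to mirror a properly typed date column.

```python
from datetime import date

# Sample rows from the question, with the date columns already typed:
# (sprintid, issueid, version_id, sprintenddate, releasedate)
rows = [
    (50, 10, 51147, date(2021, 11, 9), date(2021, 12, 15)),
    (50, 11, 51147, date(2021, 11, 9), date(2021, 12, 15)),
    (50, 12, 51147, date(2021, 11, 9), date(2021, 12, 15)),
    (51, 13, 51147, date(2021, 12, 10), date(2021, 12, 15)),
    (51, 13, 51147, date(2021, 12, 10), date(2021, 12, 15)),
    (51, 14, 51147, date(2021, 12, 10), date(2021, 12, 15)),
]


def keep_max_per_key(rows, key_index, value_index):
    # Remember, for each key, the row with the largest value seen so far.
    best = {}
    for row in rows:
        key = row[key_index]
        if key not in best or row[value_index] > best[key][value_index]:
            best[key] = row
    # Dicts keep insertion order, so keys appear in first-seen order.
    return list(best.values())


# One row per issueid (column 1), chosen by the latest sprintenddate (column 3).
result = keep_max_per_key(rows, key_index=1, value_index=3)
for row in result:
    print(row)
```

On ties, such as the two identical rows for issueid 13, this sketch keeps the first one it encounters.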

Br,
Ivan

Thanks @ipazin. I always forget that the -Duplicate Row Filter- has this feature :sweat_smile:

Best

Ael

You are welcome and no worries @aworker. You don’t forget many other features :wink:
Ivan

Interesting, for these questions I always think of the GroupBy node first. Thanks for reminding me about the Duplicate Row Filter and its options @aworker @ipazin

@Daniel_Weikert and you can always use a local database and window functions to deal with duplicates in a controlled way (rank, first, last). But for most purposes the new duplicate row filter should do the job :slight_smile:
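To illustrate the window-function route, here is a small sketch using Python's built-in `sqlite3` module as a stand-in for a local database. The table name `sprints` and the choice of `ROW_NUMBER()` are assumptions for this example (the ranking functions mentioned above would work along the same lines), and window functions need SQLite 3.25 or newer. Dates are converted to ISO text so that ordering them as strings matches chronological order.

```python
import sqlite3
from datetime import datetime

# Sample rows from the question:
# (sprintid, issueid, version_id, sprintenddate, releasedate)
rows = [
    (50, 10, 51147, "9/11/2021", "15/12/2021"),
    (50, 11, 51147, "9/11/2021", "15/12/2021"),
    (50, 12, 51147, "9/11/2021", "15/12/2021"),
    (51, 13, 51147, "10/12/2021", "15/12/2021"),
    (51, 13, 51147, "10/12/2021", "15/12/2021"),
    (51, 14, 51147, "10/12/2021", "15/12/2021"),
]


def to_iso(text):
    # Assumption: input is day/month/year; ISO text (YYYY-MM-DD) sorts correctly.
    return datetime.strptime(text, "%d/%m/%Y").date().isoformat()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sprints ("
    "sprintid INTEGER, issueid INTEGER, version_id INTEGER, "
    "sprintenddate TEXT, releasedate TEXT)"
)
conn.executemany(
    "INSERT INTO sprints VALUES (?, ?, ?, ?, ?)",
    [(s, i, v, to_iso(e), to_iso(r)) for s, i, v, e, r in rows],
)

# Number the rows within each issueid, latest sprintenddate first,
# then keep only row number 1 of each group.
query = """
SELECT sprintid, issueid, version_id, sprintenddate, releasedate
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY issueid
               ORDER BY sprintenddate DESC
           ) AS rn
    FROM sprints
)
WHERE rn = 1
ORDER BY issueid
"""
result = conn.execute(query).fetchall()
conn.close()

for row in result:
    print(row)
```

The `ORDER BY` inside the window is where the control comes from: adding more columns there decides explicitly which row wins when dates tie.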
