filter for maximum date per a given field

AdrianaFerro · December 7, 2021, 9:00am

Hello everyone,

I have the following data:

Sprintid	issueid	version_id	sprintenddate	releasedate
50	10	51147	9/11/2021	15/12/2021
50	11	51147	9/11/2021	15/12/2021
50	12	51147	9/11/2021	15/12/2021
51	13	51147	10/12/2021	15/12/2021
51	13	51147	10/12/2021	15/12/2021
51	14	51147	10/12/2021	15/12/2021

I want to filter this data for the maximum sprintenddate per issueid.

The desired output is the following:

Sprintid	issueid	version_id	sprintenddate	releasedate
50	10	51147	9/11/2021	15/12/2021
50	11	51147	9/11/2021	15/12/2021
50	12	51147	9/11/2021	15/12/2021
51	13	51147	10/12/2021	15/12/2021
51	14	51147	10/12/2021	15/12/2021

Can anyone help me on how to achieve this?

Thanks a lot

aworker · December 7, 2021, 9:04am

Hi @AdrianaFerro

Maybe the -Duplicate Row Filter- node could do this. It sounds like a job for it.

Hope this helps.

Best

Ael

AdrianaFerro · December 7, 2021, 10:58am

Hi, it looks like I’m just removing the duplicates, but I really need to filter for the maximum sprintenddate per issueid.

The example is just a very simple dataset.

Thank you

aworker · December 7, 2021, 11:04am

If you first sort by date in descending order, so that the row with the maximum sprintenddate per issueid comes first among the rows that are in some way “duplicated”, you can then filter duplicates based on the key columns that identify rows as the same.

The -Duplicate Row Filter- should keep the first row of each such group, which is the one row you want to keep.
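
Outside KNIME, the same sort-then-keep-first idea can be sketched in plain Python. This is only an illustration of the logic, not of how the node works internally. It assumes the dates are day/month/year (as 15/12/2021 suggests), and the helper `parse_date` is a made-up name:

```python
from datetime import datetime

# Sample rows from the question:
# (Sprintid, issueid, version_id, sprintenddate, releasedate)
rows = [
    (50, 10, 51147, "9/11/2021", "15/12/2021"),
    (50, 11, 51147, "9/11/2021", "15/12/2021"),
    (50, 12, 51147, "9/11/2021", "15/12/2021"),
    (51, 13, 51147, "10/12/2021", "15/12/2021"),
    (51, 13, 51147, "10/12/2021", "15/12/2021"),
    (51, 14, 51147, "10/12/2021", "15/12/2021"),
]

def parse_date(text):
    # Assumption: day/month/year order, as the sample values suggest.
    return datetime.strptime(text, "%d/%m/%Y").date()

# Step 1: sort by sprintenddate, latest first.
sorted_rows = sorted(rows, key=lambda r: parse_date(r[3]), reverse=True)

# Step 2: keep only the first row seen for each issueid.
seen = set()
kept = []
for row in sorted_rows:
    if row[1] not in seen:
        seen.add(row[1])
        kept.append(row)

# Restore issueid order for display.
kept.sort(key=lambda r: r[1])
for row in kept:
    print(row)
```

Sorting on parsed dates rather than on the raw text matters here: as strings, "9/11/2021" would sort after "10/12/2021".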

Hope this trick helps

Best

Ael

AdrianaFerro · December 7, 2021, 11:25am

Thank you so much! It worked

aworker · December 7, 2021, 11:31am

My pleasure, and glad it worked!

Best

Ael

ipazin · December 7, 2021, 4:46pm

Hello @AdrianaFerro,

if your date column is of type Local Date in KNIME, then you don’t need to sort before the Duplicate Row Filter node, since you can choose the maximum of that column in the node itself.

[Image: DateMaximum]
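
As a rough illustration of why a real date type can matter when picking a maximum, here is a small Python sketch. The rows are hypothetical (in the thread's sample, issue 13 never appears in two sprints), the day/month/year format is an assumption, and `to_date` is a made-up helper. It says nothing about how the KNIME node behaves internally:

```python
from datetime import datetime

# Hypothetical rows (not from the thread): issue 13 appears in two sprints.
rows = [
    {"Sprintid": 50, "issueid": 13, "sprintenddate": "9/11/2021"},
    {"Sprintid": 51, "issueid": 13, "sprintenddate": "10/12/2021"},
    {"Sprintid": 51, "issueid": 14, "sprintenddate": "10/12/2021"},
]

def to_date(text):
    # Assumption: day/month/year order.
    return datetime.strptime(text, "%d/%m/%Y").date()

# Keep, for each issueid, the row with the latest sprintenddate.
latest = {}
for row in rows:
    key = row["issueid"]
    current = latest.get(key)
    if current is None or to_date(row["sprintenddate"]) > to_date(current["sprintenddate"]):
        latest[key] = row

# Pitfall: comparing the raw text instead of dates picks the wrong row
# for issue 13, because the character "9" sorts after "1".
issue_13 = [r for r in rows if r["issueid"] == 13]
latest_as_text = max(issue_13, key=lambda r: r["sprintenddate"])

print(latest[13]["Sprintid"], latest_as_text["Sprintid"])
```

With real dates the sprint 51 row is kept for issue 13, while the text comparison would keep the sprint 50 row.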

Br,
Ivan

aworker · December 7, 2021, 4:49pm

Thanks @ipazin. I always forget that the -Duplicate Row Filter- has this feature

Best

Ael

ipazin · December 7, 2021, 4:53pm

You are welcome and no worries @aworker. You don’t forget many other features.
Ivan

Daniel_Weikert · December 7, 2021, 6:33pm

Interesting, for these questions I always think of the GroupBy node first. Thanks for reminding me about the Duplicate Row Filter and its options, @aworker @ipazin

mlauber71 · December 7, 2021, 11:53pm

@Daniel_Weikert and you can always use a local database and window functions to deal with duplicates in a controlled way (rank, first, last). But for most purposes the new Duplicate Row Filter should do the job.
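
A minimal sketch of the window-function approach, using an in-memory SQLite database from Python as a stand-in for whatever local database you use. The table name `sprints`, the choice of SQLite, and the ISO date strings are assumptions for illustration, and window functions need SQLite 3.25 or newer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sprints (sprintid INTEGER, issueid INTEGER, sprintenddate TEXT)"
)
# Dates are stored as ISO text (YYYY-MM-DD) so text ordering matches date ordering.
conn.executemany(
    "INSERT INTO sprints VALUES (?, ?, ?)",
    [
        (50, 10, "2021-11-09"),
        (50, 11, "2021-11-09"),
        (50, 12, "2021-11-09"),
        (51, 13, "2021-12-10"),
        (51, 13, "2021-12-10"),
        (51, 14, "2021-12-10"),
    ],
)

# Number the rows within each issueid, latest sprintenddate first,
# then keep only row number 1 of each group.
query = """
SELECT sprintid, issueid, sprintenddate
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY issueid
               ORDER BY sprintenddate DESC
           ) AS rn
    FROM sprints
)
WHERE rn = 1
ORDER BY issueid
"""
latest_rows = conn.execute(query).fetchall()
conn.close()

for row in latest_rows:
    print(row)
```

Using `ROW_NUMBER()` keeps exactly one row per issueid even when several rows tie on the date; `RANK()` would keep all tied rows instead.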