How to keep one row with latest timestamp for each day?

Hi expert,

I have a table that keeps records of various values in each row per time stamp. Like below.
DateTime measure1 measure2 measure3 measure4 …
monday 1pm x y z a
monday 4pm x y z a
tuesday 3pm x y z a
tuesday 4pm x y z a
tuesday 5pm x y z a

I want to figure out how to use a combination of nodes to filter the row so keep only Monday 4pm and Tuesday 5pm. In another word, keep 1 value for day but keep the maximum of time. I want the resulting table to be below. Thanks for your thoughts.

DateTime measure1 measure2 measure3 measure4 …
monday 4pm x y z a
tuesday 5pm x y z a


Yo @dataNinja

Use the Duplicate Row Filter Node. See duplicate_row_filter_data_time.knwf (17.8 KB)
Screenshot from 2020-03-15 20-19-15
gr. Hans


Thank you! This is working for me.


Hi there @dataNinja,

How about GroupBy node with Maximum Aggregation method?


1 Like

This usecase was by the way one why we made the Duplicate Row Filter
It allows the selection of a row based on the values in other columns.


Hi @ipazin,

I tried GroupBy node first. However, GroupBy requires you to select Columns to GroupBy and select columns to use max aggregation only works on selected column(s)

Above cause problems for me since (1) I need all value column retained and (2) it does not know how to keep values from other columns where the “MAX” row is found.

Duplicate Row Filter solves my needs.


Hi @dataNinja,

I get it know. Makes sense :slight_smile:


This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.