Feature suggestion: Group-based Row Filter

Aswin · September 8, 2020, 11:54am

Dear Knimers,

I often find myself needing the first or the last row of each group of rows. As far as I can tell there are currently two ways of doing this: use a Group Loop with a Row Filter inside the loop to keep the first or last row (slow), or use a GroupBy node and specify “First” or “Last” for every column or type (awkward, error-prone when there are missing values, and the RowID is lost).

I suggest a Group-based Row Filter node, where you can choose to keep the first or the last row of every group. Possible additional features could be keeping the middle row(s) or random row(s) of every group, and a Row Splitter based on the same idea.
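
To make the request concrete, here is a minimal sketch in plain Python of what such a filter would compute. It is not KNIME code: the function name `first_and_last_per_group`, the `(row_id, row)` layout and the sample columns are all made up for illustration. Whole rows are kept, so the row ID survives and a missing value stays where it was.

```python
def first_and_last_per_group(rows, key):
    """Return two dicts mapping each group value to its first and last (row_id, row)."""
    first, last = {}, {}
    for row_id, row in rows:
        group = row[key]
        # Only the first row seen for a group is stored here.
        first.setdefault(group, (row_id, row))
        # Every later row of the group overwrites the previous one.
        last[group] = (row_id, row)
    return first, last


# Hypothetical sample table: (row ID, cells); None stands for a missing value.
rows = [
    ("Row0", {"customer": "A", "amount": 10}),
    ("Row1", {"customer": "B", "amount": 7}),
    ("Row2", {"customer": "A", "amount": None}),
    ("Row3", {"customer": "B", "amount": 3}),
]

first, last = first_and_last_per_group(rows, "customer")
first_ids = [row_id for row_id, _ in first.values()]
last_ids = [row_id for row_id, _ in last.values()]
print(first_ids)  # ['Row0', 'Row1']
print(last_ids)   # ['Row2', 'Row3']
```

This is a single pass over the table in its existing order, which is the behaviour I would hope for instead of looping over the groups one by one.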

Best
Aswin

Aswin · September 8, 2020, 12:10pm

Sorry, I see the Duplicate Row Filter can be used for this… Forget my post

s.roughley · September 8, 2020, 12:11pm

I would add to this the ability to use aggregators, e.g. keeping the row of each group with the min/max value. I do this so often with a GroupBy/Row Filter/Joiner sequence
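
As a rough sketch of the idea, again in plain Python rather than anything KNIME-specific: the helper name `extreme_row_per_group`, its parameters and the sample data are hypothetical, and skipping rows with a missing value is just one possible choice.

```python
def extreme_row_per_group(rows, key, value, pick_max=True):
    """Return a dict mapping each group to the (row_id, row) with the max or min value."""
    best = {}
    for row_id, row in rows:
        if row[value] is None:
            continue  # assumption: rows with a missing value are ignored
        group = row[key]
        current = best.get(group)
        if current is None:
            best[group] = (row_id, row)
            continue
        if pick_max:
            better = row[value] > current[1][value]
        else:
            better = row[value] < current[1][value]
        if better:  # strict comparison, so ties keep the earliest row
            best[group] = (row_id, row)
    return best


# Hypothetical sample table: (row ID, cells); None stands for a missing value.
rows = [
    ("Row0", {"customer": "A", "amount": 10}),
    ("Row1", {"customer": "B", "amount": 7}),
    ("Row2", {"customer": "A", "amount": None}),
    ("Row3", {"customer": "B", "amount": 3}),
    ("Row4", {"customer": "A", "amount": 25}),
]

max_ids = {g: rid for g, (rid, _) in extreme_row_per_group(rows, "customer", "amount").items()}
min_ids = {g: rid for g, (rid, _) in extreme_row_per_group(rows, "customer", "amount", pick_max=False).items()}
print(max_ids)  # {'A': 'Row4', 'B': 'Row1'}
print(min_ids)  # {'A': 'Row0', 'B': 'Row3'}
```

The whole row comes back in one pass, so there is no need to aggregate first and then join the result back onto the original table.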

Steve

s.roughley · September 8, 2020, 12:12pm

Oh wow! It looks like Duplicate Row Filter can do my add-on request too

Steve

Aswin · September 8, 2020, 12:14pm

Maybe “Duplicate Row Filter” is not such a good name for this node, since it can do much more than that

ipazin · September 9, 2020, 11:22am

Hello!

Glad to see experienced KNIMErs still discovering KNIME

@Aswin Got any name suggestions?

Br,
Ivan

s.roughley · September 9, 2020, 1:37pm

To be fair (as a node developer myself!) it is often quite easy to come up with a node name that describes how you think of a node but is confusing to others… In fact, I’ve sometimes looked at nodes I wrote a year or two earlier and been left wondering what the name was supposed to mean at all. That said, unless a name becomes very misleading, it’s generally not a good idea to start changing node names on a regular basis, as that’s a definite route to confusing everyone!

Steve

pawanmtm · September 9, 2020, 5:24pm

As we have a plethora of nodes, there is a general tendency to overlook some of them and then rediscover them, either on our own or through the forum.

Regards,
Pavan

Aswin · September 10, 2020, 9:06am

Dear @ipazin, I agree with @s.roughley that name changes should be avoided as much as possible. Perhaps the functionality of the Duplicate Row Filter can be split into a more basic node, which can still be called “Duplicate Row Filter”, and more feature-rich nodes “Group-based Row Filter” and “… Splitter”… Just a suggestion.

Best,
Aswin

ipazin · September 10, 2020, 9:20am

Hi @Aswin,

that wouldn’t be backward compatible, so I don’t think that’s the way to go. Let’s see if this pops up multiple times and then take some action.

Br,
Ivan

Aswin · September 10, 2020, 9:29am

Dear @ipazin

the old Duplicate Row Filter could simply get the label “Deprecated”, or is it not that simple?

Not that I am not happy with the current situation, as long as I don’t forget the power of the Duplicate Row Filter node…

Best,
Aswin

ipazin · September 10, 2020, 10:11am

Hi @Aswin,

not really sure

Bookmark this topic

Br,
Ivan

system · March 11, 2021, 10:11pm

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.