I have a table with two columns. The first column, “Items”, contains values that repeat; the second column, “Counter”, numbers each occurrence of an item from 1 to N.

I need to find a way to filter out the last two iterations of each item, and so far, I’ve come to two possible solutions:

Proposal 1: For each item, find the two highest values of its counter and mark those rows, then filter them out with a Row Filter node.

Proposal 2: Reverse the count for each item so that it runs from N to 1 instead of from 1 to N, then filter out every row where “Counter” is 1 or 2.

This is where I’m stuck. Any ideas are truly appreciated.

hi @mcrisnidh
my proposal, using k filter nodes.
Do you always want to remove the last 2, no matter how often the item appears?
If so, you could first use a Rank node, grouping by item and ranking on counter in descending order, and then use a Row Filter node with range checking to filter out ranks 1 and 2.
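
To show the logic of the rank-then-filter idea outside of the workflow, here is a minimal Python sketch. It is only an illustration, not what the nodes do internally: the function name `drop_last_two` and the sample rows are made up, and it assumes each counter value is unique within its item, as in the 1 to N numbering described above.

```python
from collections import defaultdict


def drop_last_two(rows):
    """Remove the two highest-counter rows of each item.

    rows: list of (item, counter) tuples (hypothetical input shape).
    Assumes counter values are unique within an item (1..N).
    """
    # Collect the counters that belong to each item.
    by_item = defaultdict(list)
    for item, counter in rows:
        by_item[item].append(counter)

    # Rank within each item, descending: the highest counter gets rank 1.
    rank = {}
    for item, counters in by_item.items():
        for r, counter in enumerate(sorted(counters, reverse=True), start=1):
            rank[(item, counter)] = r

    # Keep only rows ranked 3 or lower, i.e. drop ranks 1 and 2.
    return [(item, counter) for item, counter in rows if rank[(item, counter)] > 2]


# Made-up sample data: A appears 4 times, B 3 times, C once.
rows = [("A", 1), ("A", 2), ("A", 3), ("A", 4),
        ("B", 1), ("B", 2), ("B", 3),
        ("C", 1)]
result = drop_last_two(rows)
print(result)
```

Note that an item with only one or two rows disappears entirely under this rule, which is the behaviour the question above ("no matter how often the item appears") is asking about.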

