Picking rows with values that are the tenth biggest or bigger

My table contains rows with several information columns. One of these columns is an integer column. A value in this column can appear more than once, so values are not necessarily unique, although they can be.

Now I want to take all rows whose value in this integer column is the tenth biggest value or bigger.
So I want the rows with the biggest value, the second biggest value, and so on down to the tenth biggest.

Does anyone have an idea how I can do that?

Hopefully I was able to explain what I want to do.

You could use the "Sorter" node and sort descending by the column of interest. The second step would be to use a "Row Filter" node to include rows by number (First row number = 1, Last row number = 10).

Unfortunately the values in the integer column are not unique, and I want all rows whose value in the integer column is among the ten biggest values.

So there are not necessarily exactly ten rows to pick, which means your idea does not work for me.

Me again. You could use a GroupBy node to identify the tenth biggest value (this makes the values unique). You could then turn this number into a variable (Table Row to Variable node) and use it in a Row Filter node to find all rows with this number or greater. I think this will work.
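For readers who think in code, here is a minimal sketch of the same idea in pandas rather than KNIME. It assumes the table is a DataFrame. The function name `rows_at_or_above_kth_largest` and the sample columns `id` and `score` are made up for illustration, and the fallback for fewer than k distinct values is my own assumption.

```python
import pandas as pd


def rows_at_or_above_kth_largest(df, column, k=10):
    """Keep every row whose value is >= the k-th largest distinct value."""
    # Distinct values, largest first (roughly what GroupBy + Sorter produce).
    distinct = df[column].dropna().drop_duplicates().sort_values(ascending=False)
    if distinct.empty:
        return df.iloc[0:0]
    # Assumption: with fewer than k distinct values, use the smallest one.
    threshold = distinct.iloc[min(k, len(distinct)) - 1]
    # The threshold plays the role of the flow variable fed to the Row Filter.
    return df[df[column] >= threshold]


# Hypothetical sample data with ties in the integer column.
example = pd.DataFrame({"id": list("abcdef"), "score": [5, 9, 9, 7, 5, 1]})
top2 = rows_at_or_above_kth_largest(example, "score", k=2)
```

With k=2 the threshold is the second biggest distinct value, so both rows holding the biggest value are kept along with the row holding the second biggest.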

How about finding the 10 biggest numbers and their positions? I mean, for example, the biggest value (12655) and its position (2nd row)…
Any idea would be appreciated…

How about a GroupBy to get the unique values, then sort them, keep only the 10 largest ones and give them a row number? Then join this row number back to the original data.
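A rough pandas equivalent of this join-based idea, as a sketch only: `top_k_with_rank` and the `rank` column name are hypothetical, and an inner merge stands in for the KNIME Joiner node.

```python
import pandas as pd


def top_k_with_rank(df, column, k=10):
    """Attach a rank (1 = biggest) to rows holding one of the k biggest values."""
    top_values = (
        df[[column]]
        .drop_duplicates()                     # unique values, like GroupBy
        .sort_values(column, ascending=False)  # biggest first
        .head(k)                               # keep only the k largest
        .reset_index(drop=True)
    )
    # The "row number" from the post: 1 for the biggest value, 2 for the next.
    top_values["rank"] = range(1, len(top_values) + 1)
    # The inner join drops every original row whose value is not in the top k.
    return df.merge(top_values, on=column, how="inner")


# Hypothetical sample data with ties in the integer column.
example = pd.DataFrame({"id": list("abcdef"), "score": [5, 9, 9, 7, 5, 1]})
ranked = top_k_with_rank(example, "score", k=2)
```

A side benefit of this variant is that each surviving row carries the rank of its value, so tied rows are easy to spot.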

Getting the n rows with the highest or lowest values in a specific column is pretty simple now with the Element Selector node. To have the RowID or row number in a separate column, use a String Manipulation node after the Element Selector with the substr() function on the ROWID column.



But I cannot find the relevant extension for the Element Selector. When I search for Active Learning nothing comes up, and I already have v4.0.2. What do you think?

Ivan, is it possible to have top N as a standard feature? Without ties it is supposed to return unique values (cut at the top n unique values); with ties it should return the top n plus all records that share one of the top n values.

You’ve got to divide the problem into smaller steps. @Macca already provided the initial steps:

  1. GroupBy Int values only to get unique ones
  2. Sort Int values desc
  3. Apply Row Filter to get the top 10 biggest values

Then you simply use a Reference Row Filter to filter the initial table for the ten biggest Int values retrieved by the steps above.
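The steps above can be sketched in pandas roughly like this. It is an illustration under assumptions, not KNIME code: `reference_filter_top_k` is a made-up name, and `isin` stands in for the reference filtering step.

```python
import pandas as pd


def reference_filter_top_k(df, column, k=10):
    """Filter df to the rows whose value is one of the k biggest distinct values."""
    # Steps 1-3: unique values, then the k biggest of them.
    top_values = df[column].drop_duplicates().nlargest(k)
    # Reference filter: keep original rows whose value appears in that list.
    return df[df[column].isin(top_values)]


# Hypothetical sample data with ties in the integer column.
example = pd.DataFrame({"id": list("abcdef"), "score": [5, 9, 9, 7, 5, 1]})
kept = reference_filter_top_k(example, "score", k=2)
```

Because the filter works on values rather than row positions, tied rows are all kept and the result can contain more than k rows.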


Thank you @mw. I just expected the node to be modified to support different options. It could also be expected as part of the GroupBy node.

@joan there are a couple of ways to add the Element Selector to KNIME. Either drag and drop the node icon into your KNIME Analytics Platform from the link I provided, or go to File --> Install KNIME Extensions, find the Active Learning extension and follow the installation steps.

@izaychik63 currently not, but maybe in the future! I will check and get back to you.


@ipazin @mw @izaychik63
Thank you guys, I made it!

Just to follow up on this one.

This makes sense and a feature request has been made for it.

This one raises the question of which row to keep when there are ties. That logic is already implemented in the Duplicate Row Filter node, so it doesn’t make sense to copy it into this node as well. Simply run the Duplicate Row Filter node before the Element Selector :wink:
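As a rough pandas analogy for that order of operations (duplicates first, then top n), here is a small sketch. The name `top_k_unique_rows` and the choice to keep the first row of each tie are assumptions for illustration, not a description of how the KNIME nodes are configured.

```python
import pandas as pd


def top_k_unique_rows(df, column, k=10):
    """One row per value (first occurrence wins), then the k biggest values."""
    # Duplicate handling first: decide which row of each tie survives.
    deduplicated = df.drop_duplicates(subset=column, keep="first")
    # Then a plain top-k selection, which now returns at most k rows.
    return deduplicated.nlargest(k, column)


# Hypothetical sample data with ties in the integer column.
example = pd.DataFrame({"id": list("abcdef"), "score": [5, 9, 9, 7, 5, 1]})
unique_top2 = top_k_unique_rows(example, "score", k=2)
```

Here the second row holding the biggest value is dropped before the selection, so the result has exactly one row per top value.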



Reviving this topic to add info about the Top k Selector node (previously called Element Selector), which has been enhanced and can now return all rows associated with the top k unique values.