Partitioning absolute number of rows from top slow

Hi,

I have a sorted table with 1M rows. In the next step I want to separate out the top 1000 rows with the Partitioning node. This takes a long time, and it seems the node goes through all the rows. Why would this be necessary if you choose an absolute number of rows to be taken from the top? The Row Splitter node seems to behave in a similar fashion: it goes through all rows as well.

Just curious/Evert

Look if

works better.

Brilliant…takes the blink of an eye.

Thanks!

Best/Evert

Hello @evert.homan_scilifelab.se,

Both the Partitioning and Row Splitter nodes also create a second data set (around 999,000 rows in this case), while Row Sampling and Row Filter don’t. I would say that is the reason behind the difference in execution time.
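To illustrate the idea, here is a minimal Python sketch. It is not KNIME's actual implementation, and the helper names (`counting_rows`, `take_top`, `split_top`) are made up for this example. It only shows why producing a single "top" output can stop early, while producing a second output forces a pass over every row.

```python
from itertools import islice

def counting_rows(n_rows, counter):
    """Yield row indices and record how many rows were actually read."""
    for i in range(n_rows):
        counter["read"] += 1
        yield i

def take_top(rows, k):
    # Single output (filter-like): stop reading once k rows are collected.
    return list(islice(rows, k))

def split_top(rows, k):
    # Two outputs (splitter-like): every row is read and written somewhere.
    top, rest = [], []
    for i, row in enumerate(rows):
        (top if i < k else rest).append(row)
    return top, rest

filter_counter = {"read": 0}
top_only = take_top(counting_rows(1_000_000, filter_counter), 1000)

split_counter = {"read": 0}
top, rest = split_top(counting_rows(1_000_000, split_counter), 1000)

# Rows read by each approach, and the size of the second output.
print(filter_counter["read"], split_counter["read"], len(rest))
# prints: 1000 1000000 999000
```

Both approaches return the same top 1000 rows; the difference is only in how much of the input has to be touched and copied.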

Br,
Ivan

The Sorter node is also quite slow, presumably because it needs to compare all rows. Is there an alternative node that you know of?

Thank you for your explanations/Evert

Hello @evert.homan_scilifelab.se,

Check out the Top k Selector node. Its node description states that the implementation of this node is more efficient than the “Sorter + Row Filter” combination.
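As a rough analogy in Python (an assumption for illustration only, not how the KNIME node is implemented), selecting the top k values with a bounded heap avoids fully sorting the input, yet gives the same result as sorting everything and keeping the first k. The function names here are hypothetical.

```python
import heapq
import random

def top_k_by_sort(values, k):
    # "Sorter + Row Filter" analogue: sort everything, then keep the first k.
    return sorted(values, reverse=True)[:k]

def top_k_by_heap(values, k):
    # Top-k analogue: scan once while tracking only the k best candidates.
    return heapq.nlargest(k, values)

random.seed(0)
values = [random.random() for _ in range(100_000)]

by_sort = top_k_by_sort(values, 1000)
by_heap = top_k_by_heap(values, 1000)
same_result = by_sort == by_heap
```

For n rows, the full sort costs on the order of n log n comparisons, while the heap-based selection costs on the order of n log k, which matters when k is much smaller than n.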

Br,
Ivan

Thanks, will try!

Cheers/Evert
