row filter with cumulative sum condition


I have a table sorted by 'volume percentage' as follows:

buy_trader_id     volume                   volume percentage

3                           23136                   15.207

201                       19046                    12.519

9                           13488                    8.866

166                        8937                     5.874

31                          8113                     5.333

.......................... (and so on)

Now I wish to obtain traders contributing top 25% of volume. Here, they are the first two: '3' and '201' since the sum of volume percentages is 15.207+12.519 > 25%

In other words, I want to use a condition so that I get only those rows for which cumulative sum of volume percentages becomes 25%

I would probably use a counting loop start with a variable condition loop end like as attached.


I added a simpler example to aarons one. You can actually use the java snippet to calculate the cumulative sum of a row, and the rest is filtering rows.


Cheers, Iris

I have wanted to do such tasks previously too. It would be much easier if it was possible to have an additional feature in the groupby node which you can aggregate by a specific percentile. I.e. the 10th and 90th percentile are always handy for getting the range of the majority of the data population,

in this case you could request the upper quartile, 25th percentile.

any chance of adding this feature.