row filter with cumulative sum condition

anupamnihal · December 14, 2012, 10:59am

Hi

I have a table sorted by 'volume percentage' as follows:

buy_trader_id volume volume percentage

3 23136 15.207

201 19046 12.519

9 13488 8.866

166 8937 5.874

31 8113 5.333

.......................... (and so on)

Now I wish to obtain traders contributing top 25% of volume. Here, they are the first two: '3' and '201' since the sum of volume percentages is 15.207+12.519 > 25%

In other words, I want to use a condition so that I get only those rows for which cumulative sum of volume percentages becomes 25%

Aaron_Hart · December 14, 2012, 1:53pm

I would probably use a counting loop start with a variable condition loop end like as attached.

Iris · December 14, 2012, 4:03pm

Hi

I added a simpler example to aarons one. You can actually use the java snippet to calculate the cumulative sum of a row, and the rest is filtering rows.

Cheers, Iris

cumulative_sum_loop.zip

richards99 · January 8, 2013, 10:19pm

I have wanted to do such tasks previously too. It would be much easier if it was possible to have an additional feature in the groupby node which you can aggregate by a specific percentile. I.e. the 10th and 90th percentile are always handy for getting the range of the majority of the data population,

in this case you could request the upper quartile, 25th percentile.

any chance of adding this feature.

simon