Unique binning challenge

Thanks for clarifying @badger101 ,

Your table B has an incorrect cumulative aggregation :wink: as it adds only 400 from V to C instead of 500, but apart from that my results match. The last bin should be 620, not 520.

My workflow doesn’t use any loops or scripting nodes: it generates a table with one row for every individual item and then simply groups those rows into 1000s. That makes it suitable as long as your total quantities don’t get excessively large.
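In case it helps to see the idea outside KNIME, here is a rough Python sketch of the same "one row per item" approach. The item names and quantities are hypothetical examples (chosen so the carry-over matches the 400/100 split of V and the 620 final bin discussed above), and the bin size of 1000 is taken from the question:

```python
# Rough sketch of the "one row per item" approach, outside KNIME.
# Item names and quantities here are hypothetical examples; bin size 1000.
from collections import Counter

rows = [("A", 700), ("B", 900), ("V", 500), ("C", 520)]
BIN_SIZE = 1000

# Step 1: expand to one entry per individual unit.
units = [item for item, qty in rows for _ in range(qty)]

# Step 2: the running unit index, integer-divided by the bin size,
# assigns each unit to a bin; then re-aggregate per bin.
bins = {}
for idx, item in enumerate(units):
    bins.setdefault(idx // BIN_SIZE, Counter())[item] += 1

for b in sorted(bins):
    print(b, dict(bins[b]))
# every full bin holds exactly 1000 units; the last bin here totals 620
```

With these numbers, 400 of V tops up the second bin and the remaining 100 of V lands in the last bin alongside C, giving the 620 total.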

If you do have a much larger data set, where the total quantity across all rows becomes excessive (in terms of memory requirements) for this approach, it could potentially still be used inside a recursive loop that processes only so many rows at a time: for example, every time the cumulative quantity exceeds, say, 50,000, it would process those rows and then carry the “remaining bin” over into the next group of rows. If that turns out to be necessary and you need further ideas, I can have a further play with it.
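The carry-over idea can also be done purely arithmetically, without ever materialising a per-unit table. This is only a Python sketch of that variant, not the actual KNIME nodes; the data and bin size are again assumptions:

```python
# Memory-light sketch: each row's quantity is split across bins and the
# partial ("remaining") bin is carried forward via a running offset,
# so no per-unit table is ever built. Hypothetical data and bin size.
def bin_quantities(rows, bin_size=1000):
    bins = {}    # bin index -> {item: units placed in that bin}
    offset = 0   # cumulative units placed so far (the carry)
    for item, qty in rows:
        remaining = qty
        while remaining > 0:
            b = offset // bin_size
            space = bin_size - (offset % bin_size)  # room left in bin b
            take = min(space, remaining)
            bins.setdefault(b, {})
            bins[b][item] = bins[b].get(item, 0) + take
            offset += take
            remaining -= take
    return bins

print(bin_quantities([("A", 700), ("B", 900), ("V", 500), ("C", 520)]))
```

Because only one running offset is kept between rows, the same logic would translate naturally into a chunked or recursive-loop setup where the carry is passed from one group of rows to the next.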

Anyway, for the current “simple” scenario, please see this demo workflow on the hub:

(KNIME 5.3.2)

edit: I just tested the theory with a total cumulative quantity of 16 million, and whilst it slowed a bit (especially on the final GroupBy), it had no real issues and still got there relatively quickly.
