Cumulative sum with a condition

Hey all,

I need something like a moving aggregation sum but aggregated for every project and not for every row. It should look like this:

invoice number — invoice amount — Project — Sum of project (-> Column i need)
1000 ---------------- 100€ ----------------- AAA ------ 100€
1001 ---------------- 200€ ----------------- BBB ------ 200€
1002 ---------------- 150€ ----------------- BBB ------ 350€
1003 ---------------- 100€ ----------------- AAA ------ 200€
1004 ---------------- 50€ ------------------- BBB ------ 400€
1005 ---------------- 50€ ------------------- AAA ------ 250€
In the end I need to be able to tell which invoice exceeded a certain project sum, e.g. project AAA exceeded the sum of 180€ at invoice number 1003. Hope someone can help!

Regards,
Tom

One solution is to perform the Moving Aggregation inside a Group Loop, with the Project column as group.

1 Like

Group by Project and Left Join the result back to the original entries.

kn_example_group_project_sum.knwf (23.7 KB)

Thanks for the reply but I need the cumulated project sum for the ‘current’ invoice in each row. So in row1 the sum should be 200 because of the invoice 1001 for project BBB. In row2 it should be 350 because the sum of the invoices is 200+150 = 350. (I need the sums described in my first post)

Edit: I added 2 more examples in my orginial post to make it clear

Hi @tomekk,

Check this out:

Output:
Cumulative%20sum1

I have used a “Counting Loop Start” node to loop over the table using the number of the rows.
Then 2 row filters, one to include the current row and one to include rows up to current row. Then GroupBy on current row project and filter out the other rows and then a “Column Appender” to append the generated column (sum) to the current row and end loop.
Cumulative sum.knwf (48.4 KB)

Please let me know if this is what you were looking for.

Best,
Armin

3 Likes

I think I came up with a similar idea. Using an artificial rank and then calculating a sum up to the current point.

kn_example_group_project_sum.knwf (55.9 KB)

3 Likes

This is it, thanks to both of you @armingrudd @mlauber71! Far more complex than I thought it would be, comparing it to my simple SUMIF function in Excel.

2 Likes

Hi @tomekk!

Don’t get discouraged! It can be pretty simple with KNIME. Like this:

GroupCumulativeSum

In case I haven’t missed something of course :slight_smile: And don’t hold it against @armingrudd and @mlauber71. They like KNIME nodes so much they can’t get enough of it :smiley:

Now you can use some rule based node to see which invoice exceeded a certain project sum :wink:

2019_01_23_Group_Cumulative_Sum.knwf (15.8 KB)

Have a nice day,
Ivan

16 Likes

By the way, does the Group Loop guarantee that the row order stays the same? If not it may need a Sorter between the loop start and the aggregator.

Thanks Ivan, your solution seems to work as well. I guess this is what @Aswin was suggesting in his first answer. The row order does not stay the same so it indeed needs a sorter.

Best,
Tom

Hi!
It is sorted inside each group. If row order of whole data is important it can be performed after the Loop End node using RowID for example…
Br,
Ivan

Moving Aggregation”, what a great node. Thank you so much @ipazin. :smiling_face_with_three_hearts::smiling_face_with_three_hearts::heart_eyes::heart_eyes:
@tomekk only has to add the ID column at the beginning and sort the rows based on the ID after the Loop End node.
I have already said that the KNIME forum is a public school that no one should disregard!
https://blog.statinfer.com/a-public-knime-school-that-no-one-should-disregard/

Best,
Armin

3 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.