A workflow template for cumulative calculations

takbb · December 13, 2023, 10:29pm

Something that regularly appears on the forum is the need to perform cumulative calculations.

For the simple cumulative sum of a single column, there is of course the Moving Aggregation node, but the complexity arises when it is necessary for the the column that is to contain the calculated value to itself be included in the calculation for the following row, as part of a Mathematical formula other than just cumulative sum of a single column.

There is also additional complexity if under certain circumstances, the cumulative calculation is to be “reset”.

Take for example the following table:

If we wished to calculate the cumulative sum of Income vs Costs for each project, then for each row, the ProjectBalance would need to be calculated as the Income - Expense + previously calculated ProjectBalance

and additionally, on change of Project from A to B, the previously calculated ProjectBalance should be discarded.

So how can we achieve that in KNIME using JUST the standard non-scripting nodes, and no cheating (i.e. resorting to my usual favourite, the Java Snippet)

And how can we do it in such a way that we have a “framework” workflow that with only minor modifications specific to the particular data and calculations at hand, it will automatically provide us with the Previous row’s values as we iterate through the rows?

Well, it is certainly going to involve a loop, and of course because we want each iteration of the the loop contain information about what happened on the previous iteration, we will need a Recursive Loop. BUT - for those who have looked at Recursive Loops and thought “too hard”… fear not… for in this “reusable” workflow, you will not have to know how all the recursive stuff works. The aim here is that you can concentrate on the rules and the calculations specific for the requirement, and the rest will just sit there and provide you with automatic access to the previously calculated values.

I have uploaded a workflow to the hub here:

It looks like this:

and if I’ve got it right, you never have to modify anything except the items in the ORANGE BOXES. Which are your input data, and your calculations

To make life easy, all column values output on the previous row are made accessible as PREV#columnname. If you only want a small selection of columns to be made available, you can adjust the Column Filter in the yellow annotation box, but there is no need unless it is to save resources.

The Rule Engine rule in this example is this:

Which says “if the current project is not the same as the previous project, then reset the Project Balance calculation to zero.” . That’s not too difficult is it?

And the Math Formula is this:

Which says: “The current ProjectBalance is the current Income - current Expense + the previous Project Balance”

again… I think that is reasonably clear

And that is all there is too it. You have access, in all nodes placed in the Orange Rule/Calculation Section, to the value of the previous row’s source or calculated values, using the aforementioned PREV#columns and the rest of the workflow acts as a framework to provide those for you.

And just to check… what is the calculated ProjectBalance for my above table?

well… it’s this:

Have fun and please let me know if there are any cumulative calculations it couldn’t cope with.

takbb · December 14, 2023, 9:34am

I am aiming to simplify this further, so that a cumulative calculation can be added to any workflow by the addition of a few components and recursive loops.

This is my initial prototype, using 3 “Cumulative Framework” components that work together within a recursive loop. So you simply drop them onto any workflow and join them up in the pattern shown, with zero configuration to provide cumulative calculation functionality.

see also this component which encapsulates the above workflow, so you drop this on your workflow where you wish to do a cumulative calculation and then follow the steps in the component’s help description…

roberto_cadili · December 15, 2023, 8:03am

Brilliant, thanks for sharing!

system · March 14, 2024, 8:04am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.