Rolling column calculation

Dear KNIMErs,

I have a calculation problem I can’t wrap my head around.

I have this table of “items” (or products if you want):

image

No I want to take the overall cost (from column ‘Cost’) and subtract from wherever is revenue. If there is no revenue (as in Row0) it should start deducting in the next column (in this example Rev2021)

Also, if the revenue in a column is not “enough” the remainder should be calculated and carried over to the next column.

Here’s what it would look like for example:
Row0 – 0 / 70 / 160 (subsctracting the 100 from the 170 which is ‘enough’)
Row1 – 0 / 27 / 70 (subtracting the first 12 from Rev2020 and the remaining 13 from Rev 2021)

Here’s an example workflow (only with the example table):

I thought about using loops and temporarily saved tables but I am not 100% sure if this is a stable / good way to do this.

Any ideas?

Thanks a lot in advance!

The hands down best way is to use the @takbb Java snippet approach. That approach is embarrassingly more efficient than a loop based approach… That being said, I don’t have his “Java snippets have long memories” tricks down yet so I still fall back to a loop based approach most often. Calculating it using a loop is certainly a valid approach and not error prone if built the right way, and it does provide a more low code flow for editing the calc within the loop. Preference?

1 Like

From a sustainability point of view, having it “low coded” in KNIME directly without the use of “high languages” like Java would be preferred. The reason for this is that I always assume that at some point in time I probably have to hand over my work. If I don’t understand what I am doing there, it will be difficult to hand over (because I really don’t like to say “believe me, it will work” until all of a sudden it doesn’t).

Yeah, I know we now have chatGPT to explain the world (including Java code) to us but I am also not sure if that is a “reliable” approach. :grin:

Long story short: I would probably prefer to use loops, do you have any idea, how I could start @iCFO ?

hi kowisoft
my simple w/flow but for handling a large number of annual revenue columns(years), might not be the most optimized approach and appologize if not as per ur expectations.
image
rgds
linux knime 5.2.x

2 Likes

Thanks a lot for your feedback and support @marzukim

Unfortunately I have many more columns than in this example and also the 0’s shouldn’t be altered.

Basically, in common language a formula should do the following:

  • if the value of the first column is 0 go to the next column
  • if it is not zero, subtract the cost from it
    – was the revenue enough?
    – if yes → calculate what is left, then go to the next row
    – if no → go to the next column and repeat the same process there

Does it have to be dynamic in terms of column names and number of “Rev” columns that the calc would need move over to? Seems like the years in the column names would change if this is going to be a regular use production tool…

Is the general table structure stable enough to reference columns by position?

I think that this can be done without a loop by performing a few multi column sums / tests and then using those results to drive an if statement to apply the changes to each column.

Hi @kowisoft , I take your point about handing over low-code rather than no-code. It all comes down to which you ultimately feel is simpler.

If for example we took you “common language formula”

  • if the value of the first column is 0 go to the next column
  • if it is not zero, subtract the cost from it
    – was the revenue enough?
    – if yes → calculate what is left, then go to the next row
    – if no → go to the next column and repeat the same process there
    and compared that with this java snippet “code”, would that in any way sway you, because that is pretty much the specification for what is written here:
// Your custom variables:
Integer costRemains = null; // set initial cost remaining

// Enter your code here:

if (ROWINDEX == 0 ){costRemains=c_Cost;} // On the first iteration, grab the COST value

int revisedRev = c_Rev; // on each iteration, initially set "revised Rev" to the current revenue value for that row

if (costRemains > 0)
{
// there is still cost remaining, so attempt to deduct from current Rev

	revisedRev = revisedRev - costRemains;
	if (revisedRev < 0)
	{
		// the cost was not fully covered by this Rev, so set this Rev
		// to zero, and carry forward the remaining cost having made it "positive" again (costRemains)
		costRemains = revisedRev * -1;
		revisedRev = 0;
	}
	else
	{
		// costRemains fully covered by current Rev, so set to zero
		costRemains = 0;
	}			
}
// output change to Rev
out_Rev = revisedRev;

Rolling Columns Deduction using java snippet.knwf (18.4 KB)

3 Likes

Rolling Columns Deduction.knwf (91.8 KB)

Here is a more standard KNIME no loop / low code solution option as well. I used a regex column rename approach to offer a dynamic adaptation for changing years in the column names.

It depends on a 3 column calculation structure. If a dynamic number of columns can be present, then it would require changes.

4 Likes

@iCFO you have my admiration! :slight_smile:

@kowisoft , maybe the component solution (built on the java snippet) is an option for you as it abstracts the low-code into no-code, and once tested you can fire and forget (and hand over!).

An Integer version

and a Double version

This replaces the java snippet, and the only additional “work” required is because of my design-decision (because processing is row-based) that the “initial value” should be supplied by a flow-variable rather than an input column.

As with the java snippet example, your data must be transformed into rows for processing, and back again:


Rolling Columns Deduction using component.knwf (38.7 KB)

2 Likes

Hello @kowisoft,

if you have a fixed number of columns you should be able to avoid loops, Java, components and overengineering by using math. You simply play a bit with max/min functions. Take a look at attached workflow.
Rolling Columns Deduction_ipazin.knwf (17.1 KB)

Br,
Ivan

5 Likes

However, @kowisoft , if it does not have fixed number of columns, you can still avoid Java, but (arguably) not the over-engineering :wink: … by using my new “hot off the press” Cumulative Framework component(s)…

Rolling Columns Deduction using Cumulative Framework.knwf (75.0 KB)

2 Likes

Hello @kowisoft
I couldn’t avoid to give a try to this challenge, when I saw it. I am a few days delayed because of lack of dedicated time for forum.

This is my view for this challenge with standard nodes; this is for an ‘unlimited’ number of columns. Once again ‘MA for the magic!’ I love this node.

Math formula nodes can be nested in one single node, I left them disaggregated for comprehensibility. I hope this can still help on time.
BR

6 Likes

Wow @gonhaddock … this worked like a charm.

I especially like the wildcard driven column selection which probably somehow deals with the “not fixed number of columns” problem.

also big shout out to @takbb and @ipazin for their approaches, this motivated me to dive deeper into understanding the script nodes as well as “component development”

One thing that was new to me is the fact that I could use if conditionals in Math Formula Nodes.

Thank you everyone for opening my eyes :slight_smile:

4 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.