Stop my loop if the column sum is greater than 50% of my flow variable

Dear KNIMErs,

I am trying to mimic what my friends in IT might call a “while loop”.

Here’s my situation: I have a total value of 120. I want to loop through my table (and execute the loop body) as long as the current value is not greater than the total value of 120.

I played around with the Variable Condition Loop End node, collecting the “currentValue” in a flow variable.

I thought I could assign this currentValue flow variable to the value field in the aforementioned loop end node, but the value - of course - is an integer while the flow variable settings for “value” only accept strings. This also surprised me a little, as the conditions are set as “greater than”, “equal to”, etc., which - from my point of view - would indicate a numerical value.

I have created an example workflow here:

Thank you for your help.

I thought I could assign this currentValue flow variable to the value field in the aforementioned loop end node, but the value - of course - is an integer while the flow variable settings for “value” only accept strings.

You indeed can supply an integer flow variable. It’ll work the same as if you defined the value in the main configuration window.

That being said, I don’t think the loop in this workflow is doing what you’ve described here.

What is your expected output?

Thank you for your feedback @elsamuel :slight_smile:

Basically I want to stop executing the inner part of the workflow once a threshold has been reached. So in the example the sum of all the values in the first column is 120 and the threshold value is 60.

What I wanted to achieve is that I “somehow” track the cumulative total of each looped through value, compare it to the threshold and once it is greater than that, I would stop the execution of the loop.

I could try to write this down as “code” however, I am not sure if my less than optimal coding skills eventually would make this more confusing :laughing:

ps: saw that the original threshold value on the KNIME Hub shared workflow was wrong so I corrected that.
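For readers who think in code, the behaviour described in this post can be sketched in plain Python. This is only an illustration of the intended “while loop” logic, not a KNIME API: the function name `rows_until_threshold`, the row IDs, and the four values are hypothetical (the values were chosen only so that they sum to 120).

```python
def rows_until_threshold(rows, threshold):
    """Process rows while the running total has not exceeded the threshold.

    `rows` is a list of (row_id, value) pairs. The row that pushes the
    total over the threshold is still processed, because the condition is
    only checked before the next iteration starts.
    """
    processed = []
    running_total = 0
    index = 0
    # "while" semantics: continue as long as the condition holds and rows remain
    while running_total <= threshold and index < len(rows):
        row_id, value = rows[index]
        # ... the loop body (the "inner part of the workflow") would run here ...
        processed.append((row_id, value))
        running_total += value
        index += 1
    return processed, running_total


# Hypothetical example data: four values that sum to 120
example_rows = [("Row0", 10), ("Row1", 20), ("Row2", 40), ("Row3", 50)]
processed, total = rows_until_threshold(example_rows, threshold=60)
print(processed, total)
```

With these made-up values the loop stops after the third row, because the running total (70) is then greater than the threshold of 60.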

@kowisoft two concepts you could explore: one is Recursive Loops (Recursive Row Loop for Sales Forecasting - #10 by Iris, Recursive Looping – KNIME Hub); the other is to store information in a temporary table that gets filled at each iteration. Not the most elegant solution, I know, but I have used it at several points to store and re-use information in loops.

Thank you @mlauber71

I looked at this while searching the forums, but my problem is that I would need both: the value stored somewhere AND the Variable Condition Loop End for a “dynamic” ending of the loop (basically when the sum of the values column is greater than the threshold value).

Not sure how I can do this with a temp KNIME table…

@kowisoft it could look something like this. Collect the iterating values in a temporary file, appending to it at each step of the loop. Then, as you have already done, collect the sum of the column and decide whether to stop (or not). Maybe not the most elegant way…

Just make sure to reset the temporary file. One could also automate that by using a switch catching the first iteration of the loop (I have an example of that somewhere).
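The temporary-file idea can be illustrated outside KNIME with a small Python sketch. It assumes a one-column CSV as the temporary store; the function name `run_loop_with_temp_file`, the file name, and the sample values are hypothetical. The first iteration overwrites the file (the “reset”), later iterations append, and the column sum is re-read after each step to decide whether to stop.

```python
import csv
import os
import tempfile


def run_loop_with_temp_file(values, threshold, temp_path):
    """Append each value to a temp CSV and stop once the file's column sum exceeds the threshold."""
    iterations = 0
    column_sum = 0.0
    for iteration, value in enumerate(values):
        # Reset the temp file on the first iteration, append on all later ones
        mode = "w" if iteration == 0 else "a"
        with open(temp_path, mode, newline="") as handle:
            csv.writer(handle).writerow([value])
        # Re-read the whole temp file and compute the sum of its single column
        with open(temp_path, newline="") as handle:
            column_sum = sum(float(row[0]) for row in csv.reader(handle))
        iterations += 1
        if column_sum > threshold:
            break  # corresponds to the loop-end condition becoming true
    return iterations, column_sum


with tempfile.TemporaryDirectory() as temp_dir:
    temp_path = os.path.join(temp_dir, "loop_state.csv")  # hypothetical file name
    iterations, column_sum = run_loop_with_temp_file([10, 20, 40, 50], 60, temp_path)
print(iterations, column_sum)
```

Opening the file in write mode on the first iteration plays the role of the reset, so a stale file from an earlier run cannot distort the sum.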

@kowisoft for the fun of it I have put in a switch that makes sure the temporary table is reset at the first iteration step…

kn_forum_48262_while_condition_example_recursive_loop.knwf (52.9 KB)

I think you were very close.

Here’s my attempt at a simple workflow, which I think does what you want:

It’s basically the same as your original workflow, with the Chunk Loop Start node swapped out for a Generic Loop Start node, and a Row Filter added that takes care of the stepwise row calculations.

With the threshold set at 60, this workflow stops after processing 3 rows.

At first sight this seemed a simple problem: if we could find a running total tool (does anyone know one?), it could be solved in 2 steps, a running total and a filter.
Like @elsamuel, I kept your skeleton and changed only the loop and a couple of tools.
It is not efficient, as it loops over the whole table every time, but it does the job.
Here it is, if you wish to have a look:

I would recommend the recursive loops with the stopping criterion.

You can end the loop when a variable has the value true.
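As a rough illustration of that pattern, here is a Python sketch of a recursive-style loop: each iteration receives whatever the previous iteration handed back, and a boolean plays the role of the stop variable. This is only an analogy under my own assumptions, not the node's implementation; `recursive_loop`, `max_iterations`, and the sample rows are hypothetical names and values.

```python
def recursive_loop(table, threshold, max_iterations=100):
    """Feed the remaining rows back into each iteration until a stop flag is true."""
    collected = []
    running_total = 0
    remaining = list(table)
    for _ in range(max_iterations):  # safety cap on the number of iterations
        if not remaining:
            break  # nothing left to feed back into the loop
        # Take the first row; the rest is passed on to the next iteration
        head, remaining = remaining[0], remaining[1:]
        collected.append(head)
        running_total += head[1]
        # This boolean stands in for the flow variable that ends the loop
        stop = running_total > threshold
        if stop:
            break
    return collected, remaining


collected, remaining = recursive_loop(
    [("Row0", 10), ("Row1", 20), ("Row2", 40), ("Row3", 50)], threshold=60
)
print(collected, remaining)
```

The rows never processed stay available in `remaining`, which makes it easy to put the whole table back together afterwards.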

Thank you all very much, but I must admit, I do not even get close to understanding what you have provided here.

The idea of @elsamuel seems to come pretty close, but

a) I must have the whole table at the end (but I guess that’s something I can do) BUT
b) I must keep original RowIDs as these - in the original file - hold product IDs, which must not be changed. Of course, I am aware, that I didn’t mention this before.

Edit: I guess the combination of both approaches (the temp_table from @mlauber71 and an adjusted version of @elsamuel's workflow) does the trick.

I know it’s probably not the most elegant workflow in the world (with reading - manipulating - writing RowIDs and with helper columns … ) but in the end it works.

I entered a new threshold and a few more values and it still works…

btw: I tried it with a threshold larger than the sum of the values available in the table, and voilà … had an endless loop :crazy_face: lucky me, in my real world scenario, this cannot happen (as I will have a percentage of a column total so there will always be more “values” than the threshold)
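An endless loop of this kind arises whenever the stop condition can never become true, for example when the rows run out before the running total exceeds the threshold. A small Python sketch of the two safeguards implied here: deriving the threshold as a percentage of the column total, and checking up front that the loop can stop. Both function names and the sample values are hypothetical, and the check assumes all values are non-negative.

```python
def threshold_from_percentage(values, percentage):
    """Return the threshold as a percentage of the column total (e.g. 50 for 50%)."""
    return sum(values) * percentage / 100.0


def ensure_loop_can_stop(values, threshold):
    """Raise early if the running total can never exceed the threshold.

    Assumes non-negative values, so the column total is the largest
    running total the loop can ever reach.
    """
    if sum(values) <= threshold:
        raise ValueError("column total never exceeds the threshold; the loop would not stop")


values = [10, 20, 40, 50]  # hypothetical values that sum to 120
threshold = threshold_from_percentage(values, 50)
ensure_loop_can_stop(values, threshold)  # passes: 120 > 60.0
print(threshold)
```

With a percentage below 100 and a positive column total, the derived threshold is always smaller than the total, which is why the real-world scenario described above cannot run forever.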

here’s the final version:

Thank you guys, you really rock!

@kowisoft in the end you will have to judge how you want to solve your problems :slight_smile: and if you are happy with it we are all happy. I would like to add these remarks:

  • as far as I can see, your loop would not iterate over the lines one by one but would fill up the loop at each step, so the number of lines processed would keep growing. If you only have small amounts of data (and if this is what you want), this might not be an issue
  • as far as I understood your request, you wanted to process the chunks within a loop ‘as they are’ and be able to store more general information from within the loop. So you might be able to process groups of data, or each line on its own, instead of piling up lines until you reach the limit
  • the question is whether you could achieve the same result by computing a cumulative sum (Creating a column with the cumulative sum of values in another column. Possible without scripting? - #2 by ipazin) and then just using a filter …
  • I am not quite sure about the role of the temp file. In any case I would advise against using an unconnected file writer within any loop and just hoping that it would do the right thing at the right time. I would suggest always wiring such tasks explicitly with Flow Variables - it might save you from trouble later :slight_smile:
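To make the cumulative-sum-and-filter idea from the list above concrete, here is a minimal Python sketch using `itertools.accumulate`. It keeps the whole table and the original row IDs and only adds a flag column; the function name, row IDs, and values are hypothetical, and it assumes non-negative values so that the running total never drops back below the threshold.

```python
from itertools import accumulate


def flag_rows_by_cumulative_sum(rows, threshold):
    """Return (row_id, value, cumulative_sum, processed) for every row.

    `processed` is True when the running total *before* the row had not
    yet exceeded the threshold, i.e. a while-style loop would still have
    handled that row.
    """
    cumulative = list(accumulate(value for _, value in rows))
    flagged = []
    for (row_id, value), running in zip(rows, cumulative):
        previous_total = running - value
        flagged.append((row_id, value, running, previous_total <= threshold))
    return flagged


flagged = flag_rows_by_cumulative_sum(
    [("Row0", 10), ("Row1", 20), ("Row2", 40), ("Row3", 50)], threshold=60
)
# The "filter" step: keep only the rows a loop would have processed
kept_ids = [row_id for row_id, _, _, processed in flagged if processed]
print(flagged, kept_ids)
```

Because nothing is looped over more than once, this avoids both the growing-table effect and the risk of a loop that never ends.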

And you could also save the RowID within the loop if you want (Stop my loop if the column sum is greater than 50% of my flow variable – KNIME Hub) …
