Debugging Large Loops with Breakpoints

takbb · October 29, 2023, 5:50pm

How do other people go about debugging loops with KNIME? It is something that has puzzled me for quite a while.

Sure, you can manually pause execution and then “step loop execution”, but what do you do when you have a loop of around 10000 iterations, and you know that something is occurring at about iteration 7000, so you want to stop the loop at iteration 6999 and then step it onwards so you can see what is happening?

You can add row filters to narrow down your dataset, but how do you do it without changing anything about the data that is being processed? How do you get it to stop at iteration 6999 and then step through the processing so that you can manually inspect what is happening?

This is where I sometimes find myself, and perhaps I am missing something with the Breakpoint node, but once it has “broken” the flow, I have yet to see how to “continue” the flow as you would with a traditional breakpoint in a typical programming IDE.

Let’s take an example of a smaller loop using the example breakpoint workflow on the hub, which demonstrates that a breakpoint can be inserted to stop at iteration 15 on a 1000 iteration loop.

Yes, it breaks at iteration 15, and I can check the data/flow variables in the loop up to that breakpoint, but how do I debug “what happens next”? If I am missing something, then sorry for wasting your time, because the rest of this will turn into an academic exercise.

But if, for the sake of argument, there is no way to restart the flow from that breakpoint, then stay with me, because perhaps this will be useful.

I’ve been thinking for a while that it would be good to stop a workflow programmatically, when a given condition occurs (e.g. the 6999th iteration), but then be able to continue the loop (e.g. step it onwards from that point, or just simply do a few checks and then allow it to continue on its way up to iteration 10000).

In the absence of finding a way to make this happen with the breakpoint node, I decided there was nothing for it but to have a think about how this could be done using the standard nodes. You may have already guessed that this came about through a very specific need that had arisen in a workflow I was debugging for work that contained such a loop.

Here is the result of my attempts at making this happen, in the form of a component I have just put on the hub:

This component doesn’t modify any data, and contains no configuration. All it does is halt the workflow. The idea is that you put it on a conditional branch using something like a Java If node, to handle the conditions for it to “Halt”.

(At some point I might add config to give it a sister with similar “powers” to the breakpoint node, but for now it does the job intended)

Let’s look again at the workflow from “Examples.06_Control Structures/04_Loops/17_Usage_of_Breakpoints_in_Loops” as an example:

Here it shows the use of a standard Breakpoint node, to stop the loop at iteration 15, which it does very successfully. However, as the annotation on the Loop End node says, the end is never reached because once stopped, the breakpoint remains in a “failed” state. What I want is to be able to somehow “un-fail” it after some manual debugging and inspection, and allow the loop to complete.

So here is the same workflow but with a conditional call to my component:

(NB The “deprecated” nodes remain simply because this is an old example, but I wanted to write an exact equivalent)

Here, a Java If contains a condition that directs the flow to the lower branch on iteration 15, and also on iteration 150.

// Return 1 (lower branch) on iterations 15 and 150; otherwise return 0.
if ($${IcurrentIteration}$$ == 15 || $${IcurrentIteration}$$ == 150) {
    return 1;
} else {
    return 0;
}

The workflow stops at iteration 15, but the cool thing is that I can then click on the Loop End node and step the loop on… something that I am seemingly unable to do with the Breakpoint node.

and if I step it twice (once to re-commence execution of the current iteration and the second to go onto the next iteration), you can see the loop continues and now arrives at iteration 16…

I can also resume loop execution, and it continues until the next breakpoint condition is reached at iteration 150

Like I said at the beginning, if there was already a way to do this, I never found it, and this turned out to be an exercise in how such a component could be written. In this case, it uses Python to achieve the “break” in execution. It does this by generating an error condition which halts the flow.

I wanted to do it using a Java Snippet instead, but I didn’t find a way, as the equivalent code in a Java Snippet causes KNIME to report the error but doesn’t actually halt the workflow. [EDIT - I realise now I could possibly have done this by throwing an Abort() in a Java Snippet, so I will port this to Java Snippet at some point. Then it will work even where Python isn’t configured.]

Having stopped the flow though, how does it clear the error that made it stop? Well, that was achieved by having Python write a “flag” file to the local workflow temporary area. This allows the Python code to alternate between generating an error and not: if the flag file already exists (because the previous execution generated an error), it clears the flag and continues; otherwise it writes the flag and generates an error. Other nodes such as “Fail in Execution” couldn’t be incorporated because they have one job, “fail”, and they are too good at it!
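For anyone curious, the flag-file toggle described above could look roughly like the sketch below. This is not the component's actual script, just a minimal illustration of the idea. The names `HaltRequested`, `halt_or_continue` and `halt.flag` are made up for the example, and a temporary directory stands in for the workflow temp area.

```python
import tempfile
from pathlib import Path


class HaltRequested(RuntimeError):
    """Raised to make the surrounding node fail and so pause the workflow."""


def halt_or_continue(flag_file: Path) -> None:
    """Fail on the first call, pass on the next, then fail again, and so on."""
    if flag_file.exists():
        # We already halted once: clear the flag and let execution continue.
        flag_file.unlink()
        return
    # No flag yet: record that we halted, then fail to stop the flow.
    flag_file.write_text("halted")
    raise HaltRequested("Halted for inspection; re-execute to continue.")


if __name__ == "__main__":
    # Stand-in for the workflow temp area (hypothetical location).
    flag = Path(tempfile.mkdtemp()) / "halt.flag"
    for attempt in range(1, 4):
        try:
            halt_or_continue(flag)
            print(f"attempt {attempt}: continued")
        except HaltRequested:
            print(f"attempt {attempt}: halted")
```

Run on its own, the first attempt halts, the second continues, and the third halts again, which is the alternating behaviour that lets a failed node be re-executed and pass.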

Thanks for reading. I hope you found it useful.

Oh, here is the demo…

Debugging Loops with Halt and Await Execution Component.knwf (31.0 KB)

iCFO · October 30, 2023, 7:44am

It doesn’t seem like a component will be able to solve this one. The underlying problem is the platform execute options being limited between “execute all” and “execute step”. It would certainly be nice if we could execute x number of steps manually, instead of just picking between execute all and a single step…

I tend to drop in temporary data filters to limit the pre-loop data so that it is focused on my break point. This works ok for most loops, but it can still require a lot of run and wait with some recursive loops.

I would love to see better loop execution editing tools like execute x iterations, next iteration, previous iteration.

takbb · October 30, 2023, 8:09am

Hi @iCFO, you’re right that a component can’t solve all the issues, but actually I have some plans.

The above component does in my view improve on the capabilities of the breakpoint because it can halt a workflow and allow it to be continued. It can also halt a workflow at several different iterations based on conditions in the java if, and again can be continued.

OK, so I don’t see a way of achieving “previous iteration”… but… I have ideas for implementing things like “run 20 iterations and stop”, and then passing it new “configuration” on the fly without resetting the workflow. A separate “controller component” would pass it instructions via an H2 database or via the file system, so that, having had it stop, you could give it a new instruction (via the yet-to-be-written controller component) such as “perform a further 10 iterations and then stop again”. I hope to find the time soon to put these ideas into practice, so watch this space.

This is a mock-up of the architecture I’m thinking of:

A “Loop iteration controller” is placed ahead of the loop end and decides on each iteration whether to stop execution or continue.

The “Loop iteration commander” is a free-standing component not attached to the workflow, so it can be re-configured and re-executed without resetting the loop. It writes instructions into a file on the file system. This file is picked up and acted on by the Loop iteration controller each time it is re-executed following a break. So you could have it initially configured to perform 100 iterations, but then on breaking, it could be “reconfigured” via the “Commander” to break on every iteration (step loop), or “allow another 500 iterations”, or “run to end”, etc.
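To make the mock-up a bit more concrete, here is a rough Python sketch of how the two sides might talk through a file. Everything in it is an assumption for illustration only: the JSON file format, the mode names (`run_n`, `step`, `run_to_end`) and the function names `write_command` and `should_halt` are invented here, and a plain Python loop stands in for the KNIME loop.

```python
import json
import tempfile
from pathlib import Path


def write_command(command_file: Path, mode: str, count: int = 0) -> None:
    """Commander side: store an instruction for the controller to pick up."""
    command_file.write_text(json.dumps({"mode": mode, "count": count}))


def should_halt(command_file: Path) -> bool:
    """Controller side: called once per iteration; True means stop here."""
    command = json.loads(command_file.read_text())
    if command["mode"] == "run_to_end":
        return False
    if command["mode"] == "step":
        return True
    # mode == "run_n": use up one allowed iteration, halt when none remain.
    remaining = command["count"] - 1
    write_command(command_file, "run_n", max(remaining, 0))
    return remaining <= 0


if __name__ == "__main__":
    # Hypothetical command file in a temporary directory.
    cmd = Path(tempfile.mkdtemp()) / "loop_command.json"
    write_command(cmd, "run_n", 3)
    halts = []
    for iteration in range(1, 11):
        if should_halt(cmd):
            halts.append(iteration)
            # Simulate the user reconfiguring the commander during the break.
            write_command(cmd, "run_n", 5)
    print(halts)  # [3, 8]
```

Because the instruction lives in the file rather than in the controller's own settings, changing it does not touch any node inside the loop, which is the point of keeping the commander free-standing.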

iCFO · October 30, 2023, 9:24am

A free standing controller for a loop breakpoint… You sly Fox!

I use component log files like this regularly to manage settings and sync components. Never dawned on me that they might allow step loop adjustments without triggering a full re-run.

takbb · October 31, 2023, 10:38am

@iCFO , sly here

I know how you enjoy components!

It slows the loop down a bit of course, but this is to be expected when debugging, and the impact is obviously less significant overall if each iteration of the loop is already quite processing-intensive.

Probably the biggest impact on performance is the functionality it contains for configuration (such as choosing the iteration flow variable). Unfortunately the config stuff all gets executed with every iteration, but I don’t see a way around that. It’s a pity there isn’t some way of turning off config-only processing during execution.

I’ll continue to try to think of ways to improve it performance-wise, but as it stands I think it will certainly aid me as a debugging tool in future.

system · January 29, 2024, 10:39am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.