I’m trying to run a Python script inside my KNIME workflow that applies LOF (Local Outlier Factor) to find outliers in my dataset. When I execute the script from inside the editor, everything seems to work fine. However, when I execute the node, execution is stuck at 30%. Other Python scripts are also very slow when run via node execution, compared to running them from inside the editor.
Inside the editor, only part of the input table is loaded into Python, for performance and interactivity reasons (1000 rows by default; this can be changed via the Row limit (dialog) option on the Options tab of the node dialog). Executing the node, in contrast, processes the full table. That is why your scripts seem so much “faster” inside the editor than when the node is executed.
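
To get a feel for how much the row limit can hide, here is a minimal sketch that times the same function on a small prefix of the data and on the whole of it. It does not use KNIME or LOF itself: `mean_knn_distance` and `time_on_prefix` are hypothetical helpers, and the naive all-pairs distance computation is only an assumed stand-in for a neighbour-based method whose cost grows faster than linearly with the number of rows.

```python
import random
import time


def mean_knn_distance(rows, k=5):
    """Naive stand-in for a neighbour-based score (NOT real LOF).

    For every row, compute the distance to all other rows and
    average the k smallest ones. The cost grows quadratically.
    """
    scores = []
    for i, a in enumerate(rows):
        dists = sorted(
            sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
            for j, b in enumerate(rows)
            if j != i
        )
        scores.append(sum(dists[:k]) / k)
    return scores


def time_on_prefix(rows, n):
    """Run the scoring on the first n rows; return (row count, seconds)."""
    start = time.perf_counter()
    scores = mean_knn_distance(rows[:n])
    return len(scores), time.perf_counter() - start


random.seed(0)
# Synthetic 2-D data; in a real workflow this would be the input table.
data = [(random.random(), random.random()) for _ in range(400)]

for n in (100, 400):
    count, seconds = time_on_prefix(data, n)
    print(f"{count} rows: {seconds:.3f} s")
```

With four times as many rows, the runtime of this sketch should grow by much more than a factor of four. The same effect explains why a script that feels instant on 1000 rows in the editor can take a long time on the full table.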

The node is also not really stuck at 30%; it just does not report any progress while the actual Python script is running. The node does not “know” what your script does (the script is essentially a black box to the node) and therefore cannot estimate a progress percentage during that time.
You see 30% because the progress bar is split into three parts. The first 30% report the progress of transferring the input data from KNIME to Python (the node “knows” about that), the next 40% are reserved for running the script (the black box), and the last 30% report the progress of transferring the output data back to KNIME (the node “knows” about that again). So after the node has been running for a while, you should see its progress “jump” from 30% to 70% and then proceed normally to 100%.
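
The 30/40/30 split described above can be sketched as a small mapping from a phase and its internal progress to the overall percentage. This is only an illustration of the arithmetic, not the node's actual implementation; `PHASES` and `overall_progress` are hypothetical names.

```python
# Percent ranges as described above: transfer in, run script, transfer out.
PHASES = {"input": (0, 30), "script": (30, 70), "output": (70, 100)}


def overall_progress(phase, fraction):
    """Map progress within one phase (0.0 to 1.0) to the overall percentage."""
    start, end = PHASES[phase]
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to the valid range
    return round(start + (end - start) * fraction)


# The script phase is a black box, so its fraction stays at 0.0
# until the script has finished and the output phase begins.
timeline = [
    ("input", 0.5),
    ("input", 1.0),
    ("script", 0.0),
    ("script", 0.0),
    ("output", 0.0),
    ("output", 0.5),
    ("output", 1.0),
]
for phase, fraction in timeline:
    print(f"{phase:>6}: {overall_progress(phase, fraction):3d}%")
```

Because nothing is reported during the script phase, the overall value sits at 30% for as long as the script runs and then continues from 70% once the output transfer starts.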

I understand that this behavior is confusing and unsatisfying. We already plan to come up with better solutions in the future, for example by pointing out more clearly that not all of the data is transferred into the dialog, and by allowing users to report progress themselves from within the script, which will then be picked up by our progress indicator.


