Sharing Python snippets within workflow?

redm · December 12, 2018, 5:34pm

Is there a simple way to share Python code within a workflow?

I have a number of Python Script nodes, for which I wrote a set of common helper functions to make the code simpler and more readable. The problem is that I copy/paste that common code manually into all script nodes, and when I change something, I have to copy/paste it into all nodes again. This is a bit cumbersome and error-prone. It would be cool if I could have a single source of truth, like a .py file in the workspace folder, which is automatically picked up/included by all snippet nodes. Is this possible somehow?
The only thing I found are templates, but they don’t really do what I want, as they always replace the entire code…

Thanks

Michael

mlauber71 · December 12, 2018, 5:53pm

You could have a Python node within a Metanode. Store that as a template, either with a relative path or with a fixed path. When you want to change something, you:

disconnect the Metanode
change what you want to change
store the Metanode again as a template
next time you open a workflow that uses the template KNIME will ask if you want to update the node
you could always reuse the template anywhere you want

Save as template

choose how to store the template

you could use the templates anywhere you want

kn_example_python_template.knar (32.4 KB)

Aswin · December 13, 2018, 9:10am

Or you could store the script in a flow variable the first time you use it, and then use the script from that flow variable in the rest of your workflow. When you change the first Python script, all subsequent Python nodes will use the new code.

redm · December 13, 2018, 12:18pm

Ah, now I know what those mysterious empty input fields in the Flow Variables config are good for!

Thanks for your replies. But unless I’m missing something, neither suggestion quite solves my problem, as they always set/share the entire code of the script nodes. I only want to share a common part of the code. My script nodes all do different things, but they share some amount of helper code, which is currently copied everywhere. What I’d like to have is something like an include or a library.

(I could probably edit .py files outside of Knime, and concat and load files into variables, and set those as contents of the nodes. But, hmm…)

Aswin · December 13, 2018, 12:35pm

Like with a normal Python program, you can put the Python functions/classes that you want to reuse in your workflow in a separate file and then import it in your Python script node. Suppose you put your functions in a file called “mylib.py”, you can import it in a Python node by starting your script with

import sys
sys.path.append("C:/location/of/my/python/file")
import mylib

(Edit: tested with Python 2.7)
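For illustration, here is a minimal, self-contained sketch of the same import pattern. The temporary folder, the contents of mylib.py, and the clean_name helper are all hypothetical stand-ins; in a real workflow the folder would be wherever the shared file actually lives.

```python
import os
import sys
import tempfile

# Stand-in for the folder that holds the shared file (hypothetical location).
lib_dir = tempfile.mkdtemp()

# Hypothetical contents of mylib.py: one small helper function.
with open(os.path.join(lib_dir, "mylib.py"), "w") as handle:
    handle.write(
        "def clean_name(name):\n"
        "    return name.strip().lower().replace(' ', '_')\n"
    )

# Make the folder importable, skipping the append if it is already listed.
if lib_dir not in sys.path:
    sys.path.append(lib_dir)

import mylib

cleaned = mylib.clean_name("  Order Date ")
print(cleaned)
```

Checking sys.path before appending simply avoids duplicate entries if the same lines run more than once in one interpreter.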

redm · December 13, 2018, 1:15pm

Right, of course, good point!
Now I just need to figure out how to make that portable to deploy on Knime server (as I need a full OS path for append())…

Aswin · December 13, 2018, 1:20pm

Maybe the knime.workspace flow variable is useful here, which should contain the workspace path…

redm · December 13, 2018, 1:55pm

Yea, that was my first naive thought as well… but knime.workspace points to some strange location on the server (at least there are no deployed workflow files).
But even if that were correct, the structure within the workspace might also differ. I’d probably need the path the (deployed) workflow currently lives in, wherever that may be.

mlauber71 · December 13, 2018, 9:47pm

Have you thought about using the new KNIME_jupyter notebook function?
https://www.knime.com/whats-new-in-knime-37#jupyter-integration

From the screenshot it seems to be able to work with KNIME paths like
knime://knime.workflow/…/

So you could store your code in a Jupyter notebook on the KNIME server and include it in each Python node via knime_jupyter.load_notebook(…).

redm · December 14, 2018, 4:12pm

That sounds interesting. But also not exactly straightforward… (never dealt with Jupyter before)

I found that I can put the .py file to load into the workflow folder (at first I was not sure whether this is supported, or whether the file would simply be cleaned up at the next opportunity or ignored on deployment…). Then I discovered the Extract Context Properties node with context.workflow.absolute-path, which seems to be an absolute path to the currently running workflow that I can feed to the sys.path.append() call. That also works on the server. This seems sufficiently portable and straightforward now.
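A rough sketch of that approach follows. It assumes the output of the Extract Context Properties node is connected to the Python node as a flow variable input, and that the script can read it from a flow_variables dictionary. The helper add_workflow_dir_to_path and the module name shared_helpers are made up for illustration, and the dictionary is faked here so the snippet runs outside KNIME.

```python
import os
import sys
import tempfile


def add_workflow_dir_to_path(flow_variables,
                             key="context.workflow.absolute-path"):
    """Append the workflow folder named by a flow variable to sys.path."""
    workflow_dir = flow_variables[key]
    if not os.path.isdir(workflow_dir):
        raise ValueError("Not a directory: %s" % workflow_dir)
    if workflow_dir not in sys.path:
        sys.path.append(workflow_dir)
    return workflow_dir


# Outside KNIME there is no flow_variables dictionary, so fake one here.
fake_workflow_dir = tempfile.mkdtemp()
with open(os.path.join(fake_workflow_dir, "shared_helpers.py"), "w") as handle:
    handle.write("def double(value):\n    return 2 * value\n")

flow_variables = {"context.workflow.absolute-path": fake_workflow_dir}
add_workflow_dir_to_path(flow_variables)

# Hypothetical shared module stored next to the workflow.
import shared_helpers

result = shared_helpers.double(21)
```

Inside a script node, only the function and the call with the node's own flow_variables would be needed; the temporary folder exists purely to make the example runnable.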

Thanks for your inspiration.

I wonder, though, why the context properties are not generally available as workflow variables, just like knime.workspace, but hidden behind that cryptic node?

Aswin · December 15, 2018, 7:40am

Another idea is to put the code to be reused in a flow variable like before, but then execute it in subsequent nodes by calling Python’s exec() function on that flow variable (eval() only evaluates a single expression, so it cannot run function definitions).
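A minimal sketch of running shared code held in a string, as it would arrive from a flow variable. It uses exec() rather than eval(), because eval() accepts only a single expression and cannot run def statements. The variable shared_code and the helper to_percent are hypothetical.

```python
# Hypothetical content of the flow variable that carries the shared code.
shared_code = (
    "def to_percent(fraction):\n"
    "    return fraction * 100.0\n"
)

# Run the definitions in a separate dictionary so they do not silently
# overwrite names that already exist in the node's own script.
helpers = {}
exec(shared_code, helpers)

to_percent = helpers["to_percent"]
percent = to_percent(0.5)
```

Keeping the executed code in its own namespace makes it explicit which names come from the shared snippet.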

Tom_Hawkins · December 20, 2018, 12:54pm

Building on the suggestion to put the shared code in a separate file and import it, it’s not much extra work to package it up as a module which you can install on the server - into the appropriate conda env if you’re using that. Then you don’t have to explicitly add its location to sys.path, you can just import mylib.