Workflow to Tool Usage

Hi,

I really appreciate the ability to integrate tools into agents via workflows, but I’m struggling to understand how to create tools directly in KNIME. Most examples show how to load a workflow and then convert it into a tool, but they don’t explain how to build the tool from scratch within the workflow itself. These examples also often rely on workflows stored in private folders, which makes it unclear what kind of information a workflow needs to expose in order to be usable as a tool.

Does anyone know of, or can anyone share, a simple, step-by-step example of a KNIME workflow that’s designed from the ground up to function as a tool?

Thanks!

2 Likes

Hey there,

I did a podcast episode / tutorial with Rosaria on this - she published it on her substack mydataguest:

I also published two beginner articles on medium:

https://medium.com/low-code-for-advanced-data-science/ai-agents-in-knime-5-5-933ac54dca84

https://medium.com/low-code-for-advanced-data-science/ai-agents-in-knime-5-5-part-2-eded98a9f57b

Maybe these resources can help get you started :slight_smile:

3 Likes

Hey Martin,

thank you very much. I managed to run a tool through the workflow; however, I wonder whether it is possible to run tools that contain Python scripts. Do you know if you have to load the Python environment somewhere?

Regards,

AG.

In general you can build whatever logic you need - Configuration nodes / the Workflow Service Input are the ways to pass data into your tool, and the Tool Message Output and Workflow Service Output are your ways to pass data back to the agent / to other tools.

In between, you can use whatever nodes KNIME has at its disposal.

With regards to Python - there is obviously the bundled environment, which has the basics covered:
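
To make that a bit more concrete, here is a rough sketch of what the Python Script node inside such a tool could look like, assuming the tool receives a table (e.g. via a Workflow Service Input) and returns a table to the agent. The column name "value" is just made up for illustration; it only uses pandas, which the bundled environment already ships with:

```python
import knime.scripting.io as knio
import pandas as pd  # available in the bundled environment

# Read the table handed to the tool (e.g. coming from a Workflow Service Input)
df = knio.input_tables[0].to_pandas()

# Example tool logic: summarise a numeric column named "value" (hypothetical column)
summary = pd.DataFrame({
    "rows": [len(df)],
    "value_sum": [df["value"].sum()],
    "value_mean": [df["value"].mean()],
})

# Hand the result back so it can flow to a Workflow Service Output / Tool Message Output
knio.output_tables[0] = knio.Table.from_pandas(summary)
```
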

If you need additional Python packages, you’d need to set up your own environment and either make it the default for your KAP (if you are only using the agent in the KNIME Analytics Platform):

or, if you are using the Hub, you’d need to propagate the env using the Conda Environment Propagation node. The downside is that this environment will be set up at run time every time, so it will add some latency…:
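
As a sketch of how that looks from the script’s side: the Python Script node just sits downstream of the Conda Environment Propagation node and imports whatever extra package you propagated. Here xgboost is purely a stand-in for any package that is not in the bundled environment:

```python
import knime.scripting.io as knio

# Assumed setup: a Conda Environment Propagation node upstream recreates the
# custom environment on the Hub before this script runs.
try:
    import xgboost  # stand-in for your extra package (assumption, not required by KNIME)
except ImportError as err:
    raise ImportError(
        "xgboost not found - check that the Conda Environment Propagation node "
        "upstream includes it and that its flow variable reaches this node."
    ) from err

df = knio.input_tables[0].to_pandas()
# ... your actual tool logic using the extra package goes here ...
knio.output_tables[0] = knio.Table.from_pandas(df)
```
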

2 Likes

Thanks Martin. Yes, the idea is to deploy the agent on the Hub server and allow it to call the tools, which means the tool workflows also need to be on the server, and the Python environment needs to be available on the server as well if Python scripts are included in the tool workflows.

I guess the real question is: can one deploy Python scripts with dedicated Python environments on the Hub server?

Cheers,

AG

Yes, that is definitely possible using the Conda Environment Propagation method.

Follow the two steps in my last post (set up a custom env, then use the Conda Environment Propagation node to select that env and pass it to the Python Script node).

It is quite well described in the docs, but I also have an older tutorial on Medium:

https://medium.com/low-code-for-advanced-data-science/how-to-create-your-own-python-environment-with-knime-a-step-by-step-guide-f91ccb9c23f1

As I said, the downside of using it on the Hub is that each time the workflow (or in your case the tool) executes, the env will be set up from scratch (i.e. the Hub does not “cache” your env, but forgets it after every run).

There should also be some examples of how to use the Conda Environment Propagation node in the workflow section on the Hub:

1 Like

Man you are fast hehehe Thanks again :grin:

2 Likes

If you have the Python environment stored in a central place that all users of the Hub have access to, you may also

  • set the flow variable “python3_command” of the Python Script node to point to the Python executable of the venv, for instance “/path/to/my/venv/bin/python”, combine the String Configuration and Python Script nodes into a component, and share this on the Hub. In this case everyone may use the Python Script node with the defined venv without having to build it (see the sanity-check sketch after this list).
  • use the new Python Extension development API to develop a Python Extension and bundle it with the Venv.
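
For the first option, a quick way to sanity-check that the “python3_command” flow variable actually took effect is a tiny Python Script node that reports which interpreter it runs under. This is only a sketch and assumes the shared venv contains the packages KNIME’s Python integration needs (e.g. pandas):

```python
import sys
import knime.scripting.io as knio
import pandas as pd

# If "python3_command" points at the shared venv (e.g. /path/to/my/venv/bin/python),
# sys.executable should resolve to that interpreter.
knio.output_tables[0] = knio.Table.from_pandas(pd.DataFrame({
    "interpreter": [sys.executable],
    "python_version": [sys.version.split()[0]],
}))
```
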
1 Like

Can you actually upload a standalone venv to the Hub and reference it, to avoid having to build it? If so, that sounds like a potentially neat workaround. In the context of tools there may be an additional challenge, though, given that there are restrictions as to how paths resolve…

1 Like

I don’t know if you can upload a venv to the hub. This’d be a very nice feature, but I doubt that you can do this.

But what I meant by “a central place where all users of the Hub have access to” is a central drive which may be mounted by everyone who has access to the Hub. For instance, if you’re using the Hub for your company and you have a company-wide intranet, then each employee might have access to this central drive. In this case, deploying your venv on this central drive will enable everyone to use it.

Looking at it from a maintenance and single-source-of-truth perspective, this’d be the preferred way of sharing Python script nodes with others when you don’t want to develop a Python-based extension.

1 Like

I think conda env propagation is the cleaner solution as this does not depend on anyone having access to a specific folder - it pretty much “remembers” the requirements.txt (so to speak) and ensures that the env gets set up correctly at run time…

In my opinion, having to set up the venv at run time is not only costly, but also risky when it comes to repeatability of the same script across multiple users, especially if the dependencies are not defined very strictly. For instance, think of the recent numpy/pandas releases >v2, where any dependency defined as something like “numpy > 1.2.78” resulted in problems when setting up a new venv.
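
One way to make such drift visible early, regardless of which sharing approach you use, is a small version guard at the top of the Python Script node. The pinned versions below are placeholders; adjust them to whatever the script was actually validated against:

```python
from importlib.metadata import version

# Placeholder pins - replace with the versions your script was validated against.
EXPECTED = {"numpy": "1.26.4", "pandas": "2.2.2"}

for pkg, wanted in EXPECTED.items():
    found = version(pkg)
    if found != wanted:
        raise RuntimeError(
            f"{pkg} {found} found, but the script was validated against {wanted}; "
            "the environment was probably resolved differently at run time."
        )
```
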

Thus I’d always go for a central single-source-of-truth venv wherever possible, to reduce ambiguities and increase the long-term stability of scripts/processes.