Preview of KNIME Analytics Platform 5.9.0

@garethmknime it is possible there are more general problems with the installation of additional packages in the nightly build under macOS / M1 (yes I am aware that this is not production code :slight_smile: ). I provided a log file and a screenshot (48022).

Hi Markus, please bear with us. It seems the current update site providing builds for the “nightly build” is not in a consistent state. I am going to start a new build (takes ~2h) to propagate new artifacts. Usually (… :trade_mark: … ) the official public update site is only published when it passes an internal consistency check but there must be something off with that one (will research).

(Edit: There are issues with the CDN mirroring … it has to wait until tomorrow when our experts are back online.)

I currently seem to have problems with the Agent Prompter. Several models that seemed to work before now say they do not support chat mode. Not sure if the fix for the parameter names issue has made the cut (yet), but it seems to work (at least with mistral-small3.1:latest).

Maybe @MartinDDDD can weigh in.

Let me try and find some time to squeeze in some tests with the models still running on my laptop - is it 5.9 nightly or 5.8.1 LTS release candidate to test?

5.9 nightly. That would be great

ok will have to download that and install extensions etc.

Just playing around in 5.8.1 with the 4bn-parameter Qwen3 model - I still think the model is “just too dumb” (due to being small) to get the parameters right. I agree that adding that numeric suffix to the parameter name will probably contribute to inferior performance.

Is this suffix required to ensure uniqueness of parameter names within the same tool or is the concern more uniqueness across all tools?

I used the Agent Chat Widget (assuming that under the hood the logic is the same) as this allows better observations as to what is going on when Tool calls etc. are enabled - results see below. I also started downloading Qwen3-vl 8bn just now so will see if that makes a difference.

Update: See at the very bottom - Qwen3-VL-8b got it done after correcting its first mistake.

I’d be keen to see some sort of KNIME Agent LLM benchmark to see how different models and their sizes differ in performance across the same agentic tasks.
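To make the benchmark idea a bit more concrete, here is a minimal sketch of how per-model results could be tallied once the same agentic tasks have been run against several models. Everything in it is an assumption for illustration: the record layout `(model, task, succeeded)`, the function name `success_rates`, and the placeholder model names are hypothetical and not part of KNIME.

```python
from collections import defaultdict


def success_rates(runs):
    """Return {model: fraction of successful runs}.

    `runs` is an iterable of (model, task, succeeded) tuples.
    This record layout is a hypothetical convention, not a KNIME format.
    """
    totals = defaultdict(int)
    wins = defaultdict(int)
    for model, _task, succeeded in runs:
        totals[model] += 1
        if succeeded:
            wins[model] += 1
    return {model: wins[model] / totals[model] for model in totals}


# Placeholder records only, not measured results.
example_runs = [
    ("model-a", "task-1", True),
    ("model-a", "task-2", False),
    ("model-b", "task-1", True),
]
rates = success_rates(example_runs)
```

Keeping the task name in each record would also allow a later breakdown by task or by number of tools available to the agent.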

Will do 5.9. probably tonight.

Qwen3-4b:

Qwen3-Vl-8b:

Ok now in 5.9 nightly - Agent Chat Widget with Qwen3-vl-8b shows the same behavior as in 5.8.1 - one failed tool call followed by a correction and then success:

Agent Prompter ran for a while and was not able to solve it first try:


On the second try, with a slightly more precise question, it works:

I notice that in comparison to 5.8.1 the numbering of tools is gone, which is probably what the request was.

So next I tried Qwen3-4b in 5.9 and the Agent Prompter delivered (same question as in successful attempt using Qwen3-vl-8b):

And the Agent Chat Widget also managed to get it done first try - I take it this proves my point that the numbering of parameters that is still around in 5.8.x hinders the performance of smaller models with tool calling:

@mlauber71 : Any specific tools that throw errors? I have not followed OpenAI API developments too closely recently, but can imagine that a possible explanation is that some older models were not trained on the same input structure and are thus not compatible with the structure required these days.

@MartinDDDD thank you for your explorations. This confirms that the thing with the parameter names is still there. That would mean that the use of local LLMs is currently basically limited to the few that do understand them. Computing power is not sufficient to really run a local instance for anything other than experimenting with it.

The corporate way currently will then be to try and access a ‘secure’ cluster of for example OpenAI that conforms to certain data privacy standards. Of course one can always use an official paid service but then you would expose your data to the wider web.

I’d summarise as follows:

  • 5.8.x still appends numerics as suffixes to tool parameters and this confuses smaller models
  • 5.9. seems to fix this and for the same workflow / tool / prompt, smaller models now have a chance

That said I think for most scenarios where agents will be deployed in production it is fair to assume that they will have multiple tools at their disposal. I tried 5.9. on my most complex set up (~20 tools) and smaller models don’t get anything right - even when given very specific instructions.

I agree that larger corporates will be able to use their “contained” AI environments (e.g. Azure) to run agents. For smaller companies that don’t have that, it’d be interesting to find out what model size is required to reliably run agents (maybe even up to a certain number of tools available to them). Assuming that setting up a contained AI environment with Azure is not possible / cost effective, they could then determine what size of GPU may need to be rented to run open-source models for this purpose.

I may experiment with this in the future and am happy to keep you posted in case I set up OS models to experiment and gather some data :slight_smile:

2 posts were split to a new topic: KNIME 5.9 Preview - List File node and path syntax

A post was split to a new topic: KNIME 5.9 Preview - DatApp Configuration Problems with Legacy Nodes under Windows

Thank you @MartinDDDD for checking it with 5.8.1 and 5.9.0.
I am glad to see the changes in 5.9.0 also brought a little improvement for these smaller models.
It’s interesting to see how the models capture the essence of the parameter names but then struggle to produce the correct variation of the name.
Perhaps adjusting the names to the ones the model seems to prefer (e.g. from selectedProfitCenter to profit_Center) can further improve reliability.
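As a rough illustration of that renaming idea, the sketch below converts a camelCase parameter name into the snake_case form the models appear to favour. It is only a sketch under assumptions: the helper names `to_snake_case` and `model_friendly_name` are made up for this post, and dropping a leading `selected` prefix is a hypothetical convention inferred from the single example above, not something KNIME does.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a camelCase identifier to snake_case."""
    # Insert "_" at every position where a lowercase letter or digit
    # is directly followed by an uppercase letter, then lowercase all.
    with_underscores = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    return with_underscores.lower()


def model_friendly_name(name: str, prefix: str = "selected") -> str:
    """Drop a leading UI-style prefix (hypothetical convention) and snake_case the rest."""
    rest = name[len(prefix):]
    # Only strip the prefix when it is followed by a capitalised word,
    # so that a name like "selectedness" is left untouched.
    if name.startswith(prefix) and rest[:1].isupper():
        name = rest
    return to_snake_case(name)


friendly = model_friendly_name("selectedProfitCenter")
```

Whether renaming actually improves reliability would still need to be tested per model; the conversion itself is the easy part.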

What were you referring to here? The suffix problem or that the models produce the wrong parameter names e.g. profit_center instead of selectedProfitCenter ?

@nemad this is not easy to pin down. I think the tools node will ‘tell’ the LLM what the parameter name is - I think there is also a mention of the parameter name in the instructions. For some LLMs, if you use “Profit Center” in conversation it uses “profit_center”, which is the screen name of the parameter. But internally it might be “profit_center-9” or something. And this is what local LLMs often are not able to figure out, even if you tell it that it is “profit_center-9”.
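To illustrate the mismatch, here is a small sketch of how a name produced by the model could be matched back to a suffixed internal name. The `-<digits>` suffix format is only my guess from the “profit_center-9” example, and `strip_suffix` / `resolve_parameter` are hypothetical helpers, not KNIME functions.

```python
import re
from typing import Optional, Sequence

# Assumed suffix format: a hyphen followed by digits at the end of the name.
_NUMERIC_SUFFIX = re.compile(r"-\d+$")


def strip_suffix(name: str) -> str:
    """Remove a trailing "-<digits>" suffix if present."""
    return _NUMERIC_SUFFIX.sub("", name)


def resolve_parameter(model_name: str, declared: Sequence[str]) -> Optional[str]:
    """Map the name the model produced onto exactly one declared name, or None."""
    if model_name in declared:
        return model_name
    base = strip_suffix(model_name)
    candidates = [d for d in declared if strip_suffix(d) == base]
    # Refuse to guess when zero or several declared names share the base name.
    return candidates[0] if len(candidates) == 1 else None


resolved = resolve_parameter("profit_center", ["profit_center-9", "year-3"])
```

Returning `None` on ambiguity is deliberate: if two parameters differ only in their suffix, silently picking one would hide exactly the kind of name clash discussed here.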

Would be great if this could be sorted out. I am aware that theoretically two parameters could have the same name. Maybe having the node that collects the tools issue a warning in that case could be an idea.

Yes, the numeric suffix did confuse models (even some of OpenAI’s models), but that quirk should be removed in 5.9. We also validate in the Workflow to Tool node that there are no parameter name collisions, because a collision will almost certainly confuse the model and usually points to a bug in the tool workflow.
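For readers who want to run a similar check on their own parameter lists, a collision check can be sketched in a few lines. This is an illustration only and not the actual validation code in the Workflow to Tool node; the function name `find_name_collisions` is hypothetical.

```python
from collections import Counter


def find_name_collisions(parameter_names):
    """Return the sorted list of parameter names that occur more than once."""
    counts = Counter(parameter_names)
    return sorted(name for name, n in counts.items() if n > 1)


collisions = find_name_collisions(["profit_center", "year", "profit_center"])
```

An empty result means every parameter name within the tool is unique.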
