Idea: Java expressions for all Node Properties?

stellarpower · November 19, 2023, 4:04am

I’m pretty new to KNIME - I discovered it a while ago, but haven’t had much chance to use it yet.

The overall idea of connecting nodes, I’m sold - it’s easier and quicker and more intuitive than writing scripts to do the same thing.

However, I do find some pretty easy operations can be quite cumbersome.

Just now, I wanted to take a filepath (as a flow variable), extract the directory, add a new child path, and then open this file. I need about three nodes for this - I have to convert path → string, then do a string variable replacement to get the parent, then this goes out to a couple of locations to add the children, and finally convert back to path again - as the CSV reader doesn’t interpret a string as a path, and I don’t see a node to manipulate path variables natively.
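For comparison, the code version of that path manipulation might look like the minimal Java sketch below. It uses only the standard `java.nio.file` API, nothing KNIME-specific; the class name `SiblingPath`, the helper `siblingOf`, and the example paths are made up for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class SiblingPath {

    // Hypothetical helper: take the directory of an existing file path
    // and append a child path to it.
    static Path siblingOf(Path existingFile, String childPath) {
        Path parent = existingFile.getParent();
        if (parent == null) {
            // A bare file name has no parent; fall back to the empty (current) path.
            parent = Paths.get("");
        }
        return parent.resolve(childPath).normalize();
    }

    public static void main(String[] args) {
        // Example input path is an assumption, standing in for the flow variable.
        Path input = Paths.get("/data/run1/raw.csv");
        System.out.println(siblingOf(input, "child/path/input.csv"));
    }
}
```

The whole operation is the single `resolve` call, which is what makes the three-node version feel heavy.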

In code, the solution is really as simple as string interpolation; it’s one line in nearly all languages. Similarly, as far as I could tell, renaming Row IDs involved first extracting the old ID as a column, then performing the string replacement, then replacing the Row IDs from that column. It’d be nice if the Row ID node could handle all this in one (as there is a check-box allowing me to get rid of the old column when performing the renaming) - so if I could enter a quick call to a function to extract from a regex and return the new ID, then this would be great.
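The kind of function meant here could be as small as the following Java sketch. It is plain `java.util.regex`, not an existing KNIME API; the name `renameRowId`, the example pattern, and the sample Row IDs are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RowIdRename {

    // Hypothetical helper: return the first capture group of the pattern
    // as the new Row ID, or keep the old ID when the pattern does not match.
    static String renameRowId(String oldId, Pattern pattern) {
        Matcher m = pattern.matcher(oldId);
        return m.find() ? m.group(1) : oldId;
    }

    public static void main(String[] args) {
        // Assumed naming scheme: keep whatever follows the underscore.
        Pattern suffix = Pattern.compile("_(\\w+)$");
        System.out.println(renameRowId("Row42_sampleA", suffix));
        System.out.println(renameRowId("Row7", suffix));
    }
}
```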

So this led me to think - what would happen if basically all entry boxes in the GUI could have literal values, or take a java expression?

One thing I started to realise last time I played with KNIME was that generally I was best trying to do things as snippets or using code-based nodes. Whilst some tasks could be done otherwise, it took too much plumbing or I didn’t know exactly what nodes I needed, when if I just wrote down what I wanted to do in code for those particular tasks, it gets done more quickly.

It’s good that nodes are granular and like atomic building blocks - but personally, I feel this can make the workflows too verbose. Maths can be similar - even compared to spreadsheets, you can express quite a bit in one line, whilst in KNIME I would usually have to break it apart more. My high-level treatment of what I need to do gets mixed in with nodes that altogether are only doing a small thing, as in this example with the paths. I think being able to open up the configuration of nodes to Java expressions like we have in the snippets would allow a lot more power in what can be done, without then changing what is already there massively. In this case, if I could simply tell the CSV reader to open $${someExistingFilePath}$$/../child/path/input.csv then I’d be able to move onto the next step really quickly, but instead I have to exit out of the box and then go and find some more nodes to do what I need.
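To make the idea concrete, here is a rough Java sketch of how such a property value could be expanded before the node ever sees it. This is not how KNIME works today; the `$${name}$$` placeholder form is simply copied from the example above, and `PropertyTemplate`, `expand`, and the sample values are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PropertyTemplate {

    // Matches placeholders written as $${name}$$ (notation taken from the post).
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\$\\{(\\w+)\\}\\$\\$");

    // Hypothetical helper: substitute every placeholder with the value of the
    // flow variable of the same name; unknown names are reported as errors.
    static String expand(String template, Map<String, String> flowVariables) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = flowVariables.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Unknown flow variable: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> vars = new HashMap<>();
        vars.put("someExistingFilePath", "/data/run1/raw.csv"); // assumed value
        String expanded = expand("$${someExistingFilePath}$$/../child/path/input.csv", vars);
        // normalize() resolves the ".." lexically, yielding the sibling path.
        Path resolved = Paths.get(expanded).normalize();
        System.out.println(resolved);
    }
}
```

The node itself would only ever receive the final string or path, which is the black-box property described in the next paragraph.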

Overall I like that I can configure things really nicely in the GUI - omitting the starting lines of this CSV file, renaming headers, is all much quicker when I can click-click and it’s done, no need to save it to a variable and then start mutating things, and then going back iteratively when inevitably I got it wrong, because I couldn’t see the result right in front of me. But there are then several scenarios, coming up for me at least, when I could express what I need in terms of flow variables etc. pretty succinctly, but from what I can see, I need a couple of nodes to achieve the same. I also think if this were possible, building it in sort of at the UI level would mean that extensions and nodes wouldn’t need source-code changes. If KNIME’s core could handle the processing of the values - much as the Swing UI I presume must collect these values and hand them off to the extension - then it’s all black-box to the extension, it just receives the value as an integer or a string or what have you. This is kind of what LibreOffice does in the formula wizard, and I’m sure I have seen it in other software too - where the user can click a little button and then go and enter a more complex expression or recurse down a level rather than type directly into an entry box.

Another option might be like we have in the flow variables tab. It is possible to use variables to provide some values in some nodes - so if it were possible to standardise that a bit further, and roll it out to almost all properties of nodes, then that might be workable.

Any thoughts?

Thanks!

takbb · November 19, 2023, 8:53am

Hi @stellarpower ,

I agree with your sentiments. There are many things that KNIME makes so simple and so many simple things it makes difficult.

PATH VARIABLES
In particular, Path variables, for the reasons you point out, do feel “half baked”. It should not be necessary to do the continual path to string / manipulate / string to path conversion just to adjust a path slightly. I notice this keenly when using the various community nodes that assist with formatting a spreadsheet.

Chances are, if I write a spreadsheet and then want a formatted version of it, the new one is going to have a slightly modified name, (different subfolder or maybe just a name prefix or suffix change) and further, if the workflow is to be reusable the file path will be in a variable. KNIME paths and lack of direct Path manipulation nodes make this trivial job into a minor sub-project! This is why I end up writing so many components because I don’t like having to repeat myself in every new workflow.

There are two schools of thought on whether KNIME should provide additional nodes when a job can already be performed from existing ones. I am firmly in the camp that it should, where the task is a common one and is effectively boilerplate placement of (noise) nodes, or where the task itself is trivial but building it in KNIME requires the mind of an expert or a programmer to perform. The whole point is that difficult tasks should become less difficult (ideally easy). It shouldn’t also be making easy tasks difficult.

By the way, in the case of path transformations you may be interested in the following component:

and its sister, which has been adapted for a workaround to a KNIME bug (which is increasingly frustrating my use of components) so that it can work inside conditional branches:

I wrote the above for exactly that scenario of minor modifications to paths without the mass of nodes. It would be great if KNIME had a dedicated component for this.

ROWID

The particular use case you’ve cited is not one I’ve had much of an issue with. I tend to use the RowId node to copy out the existing rowid for some future purpose or to update the existing rowids with a new sequential list. Maybe you use rowid more extensively than I do. But I take your point that often there are things requiring multiple additional steps for one node where in a different node equivalent behaviour can be achieved by a simple configuration setting.

SCRIPTS AS PARAMETERS

On to your (main) point about being able to write scripts or expressions as parameters, yes I also think that would be a great idea. I’m not sure whether it would be Java as you have put in the title, but certainly I think a common expression language across all node parameters would improve what is already a great tool astronomically. There are many occasions where I just want to pass “variable+1” or similar and inclusion of a new node for something trivial like that is always a pain to me.

There is a section here on the forum entitled Feedback & Ideas so that such posts don’t just disappear into the many “questions/solutions” and others can also give their support or feedback. I’ve moved this post there.

stellarpower · November 24, 2023, 1:57am

Glad to see you find it the same way. It does feel like it’s very powerful in some areas, and others are a lot of manual labour.

and this!

There are two schools of thought on whether KNIME should provide additional nodes when a job can already be performed from existing ones.

And I totally get it here, it does make sense not to add too much bloat, but at the same time, maybe existing nodes can be given more options - or in general, it’s always a tradeoff. Any toolkit can end up with extra interfaces on top of core functionality or overloads or what have you, to provide flexibility for different situations or ease of use. I have heard one project describe it as an 80/20 rule - there is a high-level interface that keeps things simple and convenient for vanilla situations 80% of the time, but then you can dig deeper and do it manually the other 20%. But in any case, I get it, and agree with you, if multiple users would find it beneficial then it’s probably best to add it in. C++ generally has the approach of providing things but you only take what you want and shouldn’t be burdened by it if you don’t want to use it, and I think that can apply here. Nodes can be added into the repositories as officially-supported but optional, and nobody needs to use them, so I’d err on the side that it does no harm to provide as an option.

I will check your custom nodes out though, thanks for the link!

I’m not sure whether it would be Java

You’re right, I probably should have said Java-like. I don’t think it would strictly adhere, but it makes sense to use what is already in KNIME and is close to it. AFAIK the snippet nodes are pre-processed using a text substitution language and then Eclipse runs these through the compiler as source code - but I may be wrong there. In any case, I imagine something similar could happen.

There are many occasions where I just want to pass “variable+1” or similar and inclusion of a new node for something trivial like that is always a pain to me.

Right, perfect example, you put it better than I could have myself. In those scenarios, using an extra maths node is a bit cumbersome and starts to litter the workflow. I guess ideally the workflow largely displays the semantics of what we are trying to achieve without going into the specifics of what we do to implement it, if you follow it along.

Thanks for filing in the right place too!

system · February 22, 2024, 1:57am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.