I’m pretty new to KNIME - I discovered it a while ago, but haven’t had much chance to use it yet.
The overall idea of connecting nodes, I’m sold - it’s easier and quicker and more intuitive than writing scripts to do the same thing.
However, I do find some pretty easy operations can be quite cumbersome.
Just now, I wanted to take a filepath (as a flow variable), extract the directory, add a new child path, and then open this file. I need about three nodes for this - I have to convert path → string, then do a string variable replacement to get the parent, then this goes out to a couple of locations to add the children, and finally convert back to path again - as the CSV reader doesn’t interpret a string as a path, and I don’t see a node to manipulate path variables natively.
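For comparison, here is a minimal sketch of the same path manipulation in plain Java, using only `java.nio.file`. It is not KNIME API; the class name `SiblingPath`, the method `siblingOf`, and the example paths are made up for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class SiblingPath {
    // Take the directory containing `file` and append a child path to it.
    static Path siblingOf(Path file, String child) {
        Path parent = file.getParent();
        // getParent() returns null when `file` is just a bare file name,
        // so fall back to the child path on its own in that case.
        Path result = (parent == null) ? Paths.get(child) : parent.resolve(child);
        return result.normalize();
    }

    public static void main(String[] args) {
        // Hypothetical value standing in for the path flow variable.
        Path existing = Paths.get("data", "raw", "original.csv");
        System.out.println(siblingOf(existing, "child/path/input.csv"));
    }
}
```

The whole operation boils down to `file.getParent().resolve(child)`, which is the kind of one-liner I would like to be able to type straight into a node's configuration.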
In code, the solution is really as simple as string interpolation; it’s one line in nearly all languages. Similarly, as far as I could tell, renaming Row IDs involved first extracting the old ID as a column, then performing the string replacement, then replacing the Row IDs with the values from that column. It’d be nice if the Row ID node could handle all this in one go (as there is a check-box allowing me to get rid of the old column when performing the renaming) - so if I could enter a quick call to a function that extracts from a regex and returns the new ID, then this would be great.
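As a rough sketch of the kind of function I have in mind for the Row ID case, assuming the old IDs follow the default `Row0`, `Row1`, ... pattern. The class name `RowIdRename` and the `sample_` prefix are hypothetical, and this is plain `java.util.regex`, not a KNIME node API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RowIdRename {
    // Assumed shape of the old IDs: "Row" followed by digits.
    private static final Pattern OLD_ID = Pattern.compile("^Row(\\d+)$");

    // Extract the number from the old ID and build the new ID from it.
    // IDs that do not match the pattern are returned unchanged.
    static String rename(String oldId) {
        Matcher m = OLD_ID.matcher(oldId);
        return m.matches() ? "sample_" + m.group(1) : oldId;
    }

    public static void main(String[] args) {
        System.out.println(rename("Row12"));
    }
}
```

An expression like this, entered directly in the Row ID node, would replace the extract / replace / re-assign chain of three steps.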
So this led me to think - what would happen if basically all entry boxes in the GUI could take either a literal value or a Java expression?
One thing I started to realise last time I played with KNIME was that generally I was best off doing things as snippets or using code-based nodes. Whilst some tasks could be done otherwise, it took too much plumbing or I didn’t know exactly what nodes I needed, whereas if I just wrote down what I wanted to do in code for those particular tasks, it got done more quickly.
It’s good that nodes are granular, like atomic building blocks - but personally, I feel this can make the workflows too verbose. Maths can be similar - even compared to spreadsheets, where you can express quite a bit in one line, in KNIME I would usually have to break it apart more. My high-level treatment of what I need to do gets mixed in with nodes that altogether are only doing a small thing, as in this example with the paths. I think being able to open up the configuration of nodes to Java expressions, like we have in the snippets, would allow a lot more power in what can be done, without massively changing what is already there. In this case, if I could simply tell the CSV reader to open $${someExistingFilePath}$$/../child/path/input.csv
then I’d be able to move onto the next step really quickly, but instead I have to exit out of the box and then go and find some more nodes to do what I need.
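To make the idea a bit more concrete, here is a minimal sketch of how the text of an entry box could be expanded before the node ever sees it. Everything here is an assumption for illustration: the class `FieldExpression`, its methods, and the map of flow variables are hypothetical and are not how KNIME resolves flow variables internally. It only handles `$${name}$$` placeholders, not full Java expressions.

```java
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldExpression {
    // Matches placeholders written as $${name}$$.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\$\\{([^}]+)\\}\\$\\$");

    // Replace every placeholder with the value of the named flow variable.
    // Text without placeholders is returned unchanged, so plain literals still work.
    static String expand(String fieldText, Map<String, String> flowVariables) {
        Matcher m = PLACEHOLDER.matcher(fieldText);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = flowVariables.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Unknown flow variable: " + m.group(1));
            }
            // quoteReplacement stops '$' or '\' in the value being treated as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Expand the placeholders, then collapse segments such as "file.csv/.." lexically.
    static String expandPath(String fieldText, Map<String, String> flowVariables) {
        return Paths.get(expand(fieldText, flowVariables)).normalize().toString();
    }

    public static void main(String[] args) {
        Map<String, String> vars = new HashMap<>();
        vars.put("someExistingFilePath", "data/raw/original.csv"); // hypothetical value
        System.out.println(expandPath("$${someExistingFilePath}$$/../child/path/input.csv", vars));
    }
}
```

Note that `normalize()` works purely on the text of the path, which is why appending `/..` to a file path yields its parent directory here without touching the file system.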
Overall I like that I can configure things really nicely in the GUI - omitting the starting lines of this CSV file or renaming headers is all much quicker when I can click-click and it’s done, with no need to save it to a variable, start mutating things, and then go back iteratively when I inevitably get it wrong because I couldn’t see the result right in front of me. But there are several scenarios, coming up for me at least, where I could express what I need in terms of flow variables etc. pretty succinctly, but from what I can see, I need a couple of nodes to achieve the same. I also think that if this were possible, building it in at the UI level would mean that extensions and nodes wouldn’t need source-code changes. If KNIME’s core could handle the processing of the values - much as the Swing UI, I presume, must collect these values and hand them off to the extension - then it’s all black-box to the extension: it just receives the value as an integer or a string or what have you. This is kind of what LibreOffice does in the formula wizard, and I’m sure I have seen it in other software too - where the user can click a little button and then go and enter a more complex expression or recurse down a level rather than type directly into an entry box.
Another option might be something like the flow variables tab. It is possible to use variables to provide some values in some nodes - so if that could be standardised a bit further and rolled out to almost all properties of nodes, then that might be workable.
Any thoughts?
Thanks!