Iterated Use of KNIME Workflows

Hi,

Thanks for creating a great tool.

My question is whether it is possible to iterate a workflow automatically. For instance, if I want to run the same workflow on many different input files, is there a more efficient way than running the workflow once, reconfiguring the input file node to point to the new file, then running again, ad nauseam?

Similarly, would the same be possible for an input node that queries a SQL database with a slightly different query on each run of the workflow?

Thanks for any help.

Hi,

Iterating workflows is not easily possible right now. We currently hide
this kind of multiple execution (or loop) inside meta nodes, of which there
are only a few examples in the current release (Cross Validation, Looper).
We are investigating how we can make this more flexible and allow
users to define loops, conditions, and multiple executions themselves.

Thanks for the quick reply.

Is there a way you can recommend to iterate a workflow using tools outside of the KNIME environment, i.e. by writing another program that can access a KNIME workflow?

I'm by no means a computer programmer, but if pointed in the right direction I can usually stumble along until I get something working...

Mhmm, you could use undocumented functions, such as the BatchExecutor, which takes an existing KNIME workflow and runs it without invoking the KNIME GUI. But (a) I would not really recommend that to anyone and (b) it involves a lot of other small issues, such as setting up the classpath correctly, fiddling with the node settings to change e.g. the file name of your FileReader node, etc.
So, I guess the answer is "no, not really".
Sorry.
As I said, we are aware of this limitation and we are currently investigating if we can do something about it or if there are other (=more natural) ways to address this. It's a bit odd since KNIME was designed to "play" with your data and this way one would use it more as a visual programming interface to be run later in batch mode, so to speak.

Thanks again.

If I wanted to play around with the BatchExecutor to see if I can get things to work, what would be the starting point for calling it?

The only languages I've been exposed to before are C/C++/C#, but from what little I know - Java isn't too different? If I find a Java IDE and open up a new project, would I just need to have the right "includes" to be able to call the BatchExecutor function, or is it significantly more complex than that?

I'm just asking because it seems that I could get the bulk of the programming done in the KNIME environment and export the workflow, and then the BatchExecutor would just take the file I've saved the workflow to as an argument? The only hiccups would be, as you mention, little things like changing the node settings (such as the file name of the FileReader node), which hopefully I can figure out given enough time and patience.

Actually, the batch executor is not that straightforward to use. As Michael pointed out, it's highly experimental and we plan to rewrite it from scratch.

To make it short: Our current batch executor is org.knime.core.node.workflow.BatchExecutor. If you launch it without any arguments it will print a list of possible options, such as "-nosave", "-workflowFile=...", "-option=nodeID,name,value,type". In particular, the "-option..." argument allows the user to specify different parameters for nodes, e.g. a different file location for the file reader. You need to know the node's ID, the name of the parameter, and its type (String, int, double), but if you look into the XML files of the saved workflow, these should become obvious.
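To illustrate how the options above fit together for the "many input files" case, here is a rough sketch in Python that only assembles one command line per input file. It uses nothing beyond the class name and the three options quoted above. The workflow file name, the node ID 1, and the parameter name "DataURL" are hypothetical placeholders; the real values have to be looked up in the XML files of your saved workflow. Whether a comma inside a value can be escaped is unknown to this sketch, so it simply rejects such values.

```python
from typing import List, Sequence

# Class name as quoted in the post above.
BATCH_EXECUTOR_CLASS = "org.knime.core.node.workflow.BatchExecutor"


def build_batch_command(workflow_file: str, node_id: int, param_name: str,
                        value: str, value_type: str = "String") -> List[str]:
    """Return the argument list for one run with one overridden node setting."""
    if "," in value:
        # The option string is comma-separated; escaping rules are not known
        # here, so refuse the value rather than guess.
        raise ValueError(f"value must not contain a comma: {value!r}")
    return [
        "java", BATCH_EXECUTOR_CLASS,
        "-nosave",
        f"-workflowFile={workflow_file}",
        f"-option={node_id},{param_name},{value},{value_type}",
    ]


def commands_for_files(workflow_file: str, node_id: int, param_name: str,
                       input_files: Sequence[str]) -> List[List[str]]:
    """Build one command per input file, changing only the overridden value."""
    return [build_batch_command(workflow_file, node_id, param_name, path)
            for path in input_files]


if __name__ == "__main__":
    # Hypothetical workflow file, node ID, and parameter name.
    for cmd in commands_for_files("my_workflow.zip", 1, "DataURL",
                                  ["a.csv", "b.csv"]):
        print(" ".join(cmd))
```

Each resulting list could be handed to whatever you use to start processes. The same pattern would apply to a database reader node, with the query text as the overridden value, provided the query contains no commas.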

If you spawn the Java process yourself, i.e. "java org.knime.core.node.workflow.BatchExecutor", you must make sure that the CLASSPATH variable contains all necessary classes (otherwise it will fail to load some of the nodes in the flow). I wrote a bash script that crawls the workspace and collects the necessary information. If you provide me with your email address, I will send it to you.
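For readers who want a feel for the CLASSPATH part, the following is a minimal Python sketch, not the bash script mentioned above (whose contents are not shown in this thread). It assumes that the needed classes live in .jar files somewhere below a single root directory, which may not match a real installation (classes could also sit in plain directories). The function names and the root directory are made up for illustration.

```python
import os
import subprocess
from typing import Dict, List, Optional


def collect_jars(root: str) -> List[str]:
    """Walk `root` recursively and return all .jar files, sorted by path."""
    jars = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".jar"):
                jars.append(os.path.join(dirpath, name))
    return sorted(jars)


def build_env(root: str,
              base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and set CLASSPATH to the jars found under `root`."""
    env = dict(os.environ if base_env is None else base_env)
    # os.pathsep is ':' on Unix-like systems and ';' on Windows.
    env["CLASSPATH"] = os.pathsep.join(collect_jars(root))
    return env


def run_batch_executor(root: str, extra_args: List[str]) -> int:
    """Spawn the Java process with the collected CLASSPATH; return its exit code."""
    cmd = ["java", "org.knime.core.node.workflow.BatchExecutor", *extra_args]
    return subprocess.run(cmd, env=build_env(root)).returncode
```

Calling run_batch_executor with an empty argument list should, going by the description above, just print the available options, which makes it a cheap first check that the classpath is complete enough to load the class.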

I hope this information gives you a good start - please keep in mind that we do not really support the batch executor in our current version.

Thanks for the very helpful post. I've sent my e-mail address to you in a private message.

berthold wrote:
As I said, we are aware of this limitation and we are currently investigating if we can do something about it or if there are other (=more natural) ways to address this. It's a bit odd since KNIME was designed to "play" with your data and this way one would use it more as a visual programming interface to be run later in batch mode, so to speak.

This would be an incredibly useful feature to have in the workflow itself. I've seen features sort of like this in some of the graphical ETL tools.

How are things coming along?

I'm thinking of something like the ability to create lists of files and then pass these, or other user-created lists, to a block that iterates the process and passes the list information along, so that the results can be aggregated or stored.

Best Regards,

Jay

At the risk of being slaughtered if we fail to deliver (so please take this with a grain of salt): we are currently working on a new version of our workflow manager that will allow repeated execution of parts of your pipeline. This will allow cross validation to be done more naturally, and things like boosting can also be done within this framework. And, of course, one could pump several different files through a flow...
Now when is this going to be available? My optimistic guess would be v1.3 in fall/winter 2007...

A small update on the batch execution: As of KNIME 1.2.1 executing workflows without GUI is much easier (no need for extra magic bash scripts). I added an entry to our FAQ page that gives a short introduction.

Although it should be much easier now to execute flows headless, we still consider this experimental (in particular, on Windows systems it's a bit tricky to convince the executable to print out system messages... read the FAQ for details).

Cheers
Bernd