Call workflow action - in specific order

I feel like this should be more obvious but can’t seem to find the answer.

I have a number of workflows that should be executed in a specific order, as they often need a small output from another workflow to be added into the dataset (an output that may have taken a large number of nodes to produce, which I don’t want to replicate).

The base workflows that produce the outputs of interest are fairly large and take in excess of 20 minutes to run.

What I can’t seem to find is an option to call each workflow in order.

I can call a set of workflows upon completion of one of them, but they then all run at the same time, rather than workflow 1, then 2, then 3 and finally workflow 4.

Appreciate some guidance.

Many thanks


Hey @knightyboy,

If I understand your problem correctly, you want to execute multiple Call Workflow (Table Based) nodes in a specific order, and you also want each workflow to wait for the preceding workflow to complete.

If that’s the case, perhaps something like the example I’ve attached would work for you (it uses the Wait… node).
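As a side note, the same strictly sequential behaviour can also be scripted outside of KNIME, which sometimes makes the ordering easier to reason about. The sketch below is only an illustration of that ordering logic, not something taken from your setup: the `:execution` REST path and the basic-auth scheme are assumptions based on my recollection of the KNIME Server 4.x REST API, so please check your server’s API documentation before relying on them.

```python
# Illustration only: trigger several server workflows strictly one after another.
# NOTE: the ":execution" REST path and basic auth are assumptions -- verify the
# exact endpoint and auth method against your KNIME Server's REST API docs.
import requests

SERVER = "https://your-knime-server.example.com/knime/rest/v4/repository"  # placeholder URL
AUTH = ("username", "password")  # placeholder credentials

workflows = [
    "/Orchestration/Workflow1",
    "/Orchestration/Workflow2",
    "/Orchestration/Workflow3",
    "/Orchestration/Workflow4",
]

for wf in workflows:
    # requests.post() blocks until the call returns, so each workflow only
    # starts once the previous one has finished (or an error has been raised).
    response = requests.post(f"{SERVER}{wf}:execution", auth=AUTH)
    response.raise_for_status()
    print(f"{wf} finished with HTTP status {response.status_code}")
```

Because each call blocks until the server replies, workflow 2 never starts before workflow 1 has completed, which is the behaviour you described.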

Cheers,

@sjporter

Another option would be to take each workflow and turn it into a component, which you could then string together with other components to orchestrate the behavior at a high level.

Cheers,

@sjporter

Thanks @sjporter

So in essence, I have to create a new workflow in order to achieve this, and can’t do anything within the execution configuration/WebPortal?

I must admit I find that a little bemusing, as it would seem to be such a simple feature.

I am getting the following error when trying to do this:

[screenshot of the error]

Hey @knightyboy,

So in essence, I have to create a new workflow in order to achieve this, and can’t do anything within the execution configuration/WebPortal?

Another option would be the “Call workflow action” configuration accessible when configuring an execution of a workflow on the server.

Hope this helps!

Cheers,

@sjporter

Regarding the error you saw when attempting to use the Call Workflow node, please see the related Call Workflow (Table Based) documentation:

This node can be used to call other workflows that reside either locally or on a KNIME Server. The called workflow must contain at least one of the Container Input (Table), Container Input (Variable), Container Input (Credential) or Container Output (Table) nodes, which define the interface between the Call Workflow (Table Based) node and the called workflow in the following way […]


Right, so I successfully implemented this but am now having other issues.

The amount of data I am now processing has grown a lot larger. When I run the workflow without the Container Output (Table) node, it runs absolutely fine; however, when I include that node, the workflow seems to never finish and therefore doesn’t proceed to trigger the next workflow execution.

Thoughts?

Many thanks

Hey @knightyboy,

For clarification, which option did you end up implementing? The Call Workflow (Table Based) nodes (the first example I provided) or the “Call workflow” action in the KNIME Server scheduler?

Cheers,

@sjporter

The Call Workflow (Table Based) nodes, as I need to call the workflows in a specific order, not just complete one and then trigger a load more all at once (which seems to be the limitation of the other option).

Hey @knightyboy,

Would it be possible for you to share a sample workflow that mimics your use case (or a screenshot of the original)?

Cheers,

@sjporter

Here’s how I have placed the Container Output:

[screenshot of the workflow with the Container Output (Table) node]

I also tried branching it off of the main workflow with the same results though.

Then in terms of calling the workflows in order, it’s this:

[screenshot of the orchestration workflow]

With the workflows that contain the Container Outputs selected.

I’m not sure the End IF node is required, but I doubt it does any harm.

Hey @knightyboy,

Based on what I can see in the top screenshot, I’d recommend sending the output from the Number To String node directly to the DB Writer node instead of sending the Container Output (Table) output to the DB Writer node. Something like this:

[screenshot of the suggested node wiring]

Could you please try that and see if it helps when you try to call the workflow via the Call Workflow (Table Based) node?

Also, you’re correct that the End IF node is not required in your workflow.

Cheers,

@sjporter

I’m running this now and it’s still processing the first of the workflows, with the elapsed time about three times as long as it normally takes to complete…

If you execute the workflow separately from the higher-level “orchestration workflow”, does it run successfully? As in, does your workflow only ever hang up when it’s being called from the other workflow?

I know you said that it runs alright if you remove the Container Output (Table) node, but if you leave it in and run that workflow directly, does it work, or are there warnings/errors that we could look at?

It’s still running and the elapsed time is over five times normal, so I would guess the answer is no.

This seems to be something caused specifically by the container output node running on the server.

I can run the workflow from a full reset on my desktop successfully.

My server does have less computing power but doesn’t struggle with running the workflow usually.

Hey @knightyboy,

Is it possible that your server doesn’t have enough memory to process the data set? Do you know the size of the data you’re processing vs. the amount of working memory dedicated to your executor?

Alternatively, have you monitored the memory usage / CPU load of the executor while the job runs via the Monitoring > Executors tab in the KNIME Server WebPortal?

The DB Writer node is compatible with the KNIME Streaming Execution (Beta) extension; you could try enabling streaming and see if that helps (here’s an article on the subject). It might also be worthwhile to add a timer to the workflow and run it locally to audit the performance of each node to see if there are any clear performance bottlenecks.

Lastly, please check out this article which gives tips on optimizing KNIME workflows.

Cheers,

@sjporter


Hello @knightyboy ,

To expand on @sjporter’s mention of memory usage, what is the -Xmx heap setting in the server executor’s knime.ini file? How much RAM is that executor being given to work with?
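(For reference, the heap limit is the -Xmx line that follows the -vmargs entry in the executor’s knime.ini; a minimal fragment might look like the one below, where 8g is just a placeholder value, not a recommendation for your data volume.)

```
-vmargs
-Xmx8g
```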

And, when you log in to the WebPortal and go to Monitoring > Executors as he said, what does the CPU/memory utilization for the executor look like while the workflow is running?

Lastly, if you look in the executor_workspace/.metadata/knime/knime.log, do you see any exceptions occurring during the workflow run, or any SEVERE or ERROR log lines?
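If sifting through the log by hand is tedious, a quick script can pull out just those lines. This is only a convenience sketch: the log path below is the one mentioned above, so adjust it to wherever your executor workspace actually lives.

```python
# Convenience sketch: print exception/ERROR/SEVERE lines from the executor's knime.log.
# The path is the one referenced above -- adjust it to your executor workspace location.
from pathlib import Path

log_path = Path("executor_workspace/.metadata/knime/knime.log")

for line in log_path.read_text(errors="replace").splitlines():
    if "ERROR" in line or "SEVERE" in line or "Exception" in line:
        print(line)
```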

Thanks,
Nick


I have had a screen share with Ana (KNIME team member) and we worked out that:

A) There seems to be a bit of a bug with the Container Output node and Call Workflow combination on KNIME Server 4.12.1 that was causing an issue even with the relatively small data size of the table (225 MB).

B) I do not need to place the Container Output at the end of my workflow, so I can actually put it somewhere with little data; it just needs to run successfully (circumventing the issue in point A).

I have also submitted a feature request that the Server Execution Options > Call workflow action include an option to order the workflows called (so they run one after another rather than all being called at the same time), removing the need to create a separate workflow just to call others in a specific order.

