Is there a best practice for handling large workloads on KNIME Server (Debian)?

Hello everyone,

I would like to ask the experts out there for a best-practice way to handle extensive workloads on KNIME Server (Debian).

  1. More and more of our workflows use the entire CPU capacity of the KNIME Server. As an admin, I can’t control whether a workflow started by a user blocks the server for a long time, such as a day or a week. What I have in mind is that a workflow running longer than a certain amount of time is canceled with a time-out warning. Ideally, this run-time limit could be defined per workflow.

  2. The problem becomes more specific when we run R code inside the workflows, sometimes via our self-designed KNIME nodes, which communicate with R via a Java library (Rosuda REngine). If I cancel the workflow, the R process appears to keep running (as seen in htop) until it finishes on its own.
    What we need is to stop the R process immediately once a certain time limit is reached.

I hope you have some solutions to these issues.

THX a lot

Lars

Hi Lars,

I can’t comment on point 2, but for point 1 it should be possible to set up a sort of ‘watchdog’ workflow that runs on a schedule. It would be executed, e.g., every 5 minutes and would list the workflow jobs running on the server. The default format of the jobs contains the start time, and the date-time nodes give you access to the current time, so it would be possible to flag long-running workflows and either email users asking them to kill the jobs or kill the jobs automatically. The REST API already returns information about job execution time. The logical next step would be for us to add a feature like the one you describe to the AdminPortal, but we won’t be working on that for the next release.

Best,

Jon
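
For readers who want to prototype the flagging step outside a workflow, here is a minimal Python sketch of the comparison Jon describes: start time versus current time, with an optional per-workflow limit as asked for in point 1. The job fields `id`, `workflow` and `started_at` are hypothetical names chosen for illustration, not the actual format returned by KNIME Server, and fetching the job list and deleting jobs are deliberately left out.

```python
from datetime import datetime, timedelta, timezone


def find_overdue_jobs(jobs, now, default_limit, limits_by_workflow=None):
    """Return the ids of jobs that have been running longer than their limit.

    jobs: list of dicts with the hypothetical keys "id", "workflow" and
          "started_at" (an ISO 8601 timestamp that includes a UTC offset).
    now: timezone-aware datetime used as the current time.
    default_limit: timedelta applied when a workflow has no limit of its own.
    limits_by_workflow: optional dict mapping workflow path to a timedelta.
    """
    limits_by_workflow = limits_by_workflow or {}
    overdue = []
    for job in jobs:
        started = datetime.fromisoformat(job["started_at"])
        limit = limits_by_workflow.get(job["workflow"], default_limit)
        if now - started > limit:
            overdue.append(job["id"])
    return overdue


if __name__ == "__main__":
    # Made-up example data; a real watchdog would read this from the server.
    jobs = [
        {"id": "job-a", "workflow": "/team/heavy", "started_at": "2024-01-01T08:00:00+00:00"},
        {"id": "job-b", "workflow": "/team/quick", "started_at": "2024-01-01T11:30:00+00:00"},
    ]
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(find_overdue_jobs(jobs, now, default_limit=timedelta(hours=2)))
```

Keeping the limits in a plain mapping makes it easy to give a known heavy workflow a longer allowance while everything else falls back to the default.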

Thanks, Jon, for your quick answer. The solution you mentioned was the only one I could think of, and I am glad to hear that it would be best practice. Just to inform others: I have read somewhere that there is no option to cancel a workflow via the REST API, but the delete option should cancel a workflow first, which is important for us so that other services, like a running R process, are stopped first.

If anyone can contribute to the problem mentioned in point 2 of how to remotely stop R under Linux, help is appreciated.

Best
Lars
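
On point 2, one generic Linux approach is to enforce the time limit on the operating-system process itself. The sketch below is not specific to KNIME or REngine. It assumes the R code runs as its own OS process that the caller starts (for example an Rscript call), which may not match how the custom nodes talk to R. The function name `run_with_time_limit` and its parameters are made up for illustration. The child is started in its own session so that it and any helpers it spawns can be signalled together as one process group.

```python
import os
import signal
import subprocess
import sys


def run_with_time_limit(cmd, limit_seconds, grace_seconds=5):
    """Run cmd and kill its whole process group if it exceeds the limit.

    Returns the exit code of the process. On POSIX a negative value means
    the process was terminated by that signal number.
    """
    # start_new_session=True makes the child the leader of a new process
    # group, so its pid is also the id of that group.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        proc.wait(timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        # Ask politely first so the child can clean up.
        os.killpg(proc.pid, signal.SIGTERM)
        try:
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            # Still alive after the grace period: force the kill.
            os.killpg(proc.pid, signal.SIGKILL)
            proc.wait()
    return proc.returncode


if __name__ == "__main__":
    # Stand-in for a long R job: a child process that sleeps for a minute.
    long_job = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_time_limit(long_job, limit_seconds=1))
```

If R is instead embedded in the same process as the Java code, there is no separate process to signal, and a different mechanism would be needed.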