Server jobs not visible/accessible from client

tgoedeke · February 13, 2020, 8:23am

Hello,

We are running KNIME Server headless on a Linux VM.
Recently we had some issues with disk space (the disk was too full) and got “workflow unexpectedly removed…”.
We fixed that issue, but since then we can’t see any workflows that run on the server from the KNIME Analytics Platform anymore (running on Windows; different users and different versions all have the same problem).
All we get is this, and it never finishes:

The workflows that are started are still running and you can see them from the web portal.
But opening a job is impossible now, which makes debugging inconvenient and in some cases impossible.

Does anyone have any insight?

best regards,
Torsten

hornm · February 13, 2020, 2:08pm

Hi Torsten,

This is very likely a known issue that has since been fixed. The fix will be released with Server 4.10.2.

It is caused by one (or a few) corrupt job(s) (likely, though not exclusively, the result of a server crash). That is, if you delete those corrupt jobs, everything should be back to normal again. However, since you probably can’t tell which jobs are causing the problem, you might need to delete all jobs.

How to delete the jobs? Either via the WebPortal or via the Analytics Platform with ‘Use REST’ not checked in the configuration of the respective mount point.

Hope that helps. If not, please let me know.

Best,
Martin

tgoedeke · February 13, 2020, 3:15pm

Hi Martin,

thanks for the quick reply.

I unchecked the box ‘Use REST’ and reconnected to the server.
I got this error:

In the web portal, when I go to administration and jobs, I get this:

Now I am starting to get concerned.

best,
Torsten

hornm · February 13, 2020, 3:31pm

Hi Torsten,
I can’t explain at the moment why you can’t access the server with ‘Use REST’ turned off.

The other problem on the administration page of the WebPortal is expected, because it has the same cause. By deleting jobs via the WebPortal, I meant right-clicking on a workflow in the repository tree and selecting “Discard all non-running jobs”.

Does that work?

(Alternatively, you can also shut down the server, delete all jobs (i.e. directories) in the [server-workflow-repository]/jobs/ folder, and start the server again.)
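
For the folder-based route, a minimal Python sketch of the cleanup step could look like the one below. It is only an illustration under a few assumptions: the server is already shut down, every directory directly inside the jobs folder is a job, and the function name `move_job_dirs` and both paths are hypothetical. It moves the job directories to a backup location instead of deleting them, so they can be inspected or restored later; that is a more cautious variation on the deletion described above.

```python
import shutil
from pathlib import Path


def move_job_dirs(jobs_dir: Path, backup_dir: Path) -> list[str]:
    """Move every directory inside jobs_dir into backup_dir.

    Assumption: the server is shut down while this runs, and each
    directory directly inside jobs_dir is one job. Plain files in
    jobs_dir are left untouched.
    """
    jobs_dir = jobs_dir.resolve()
    backup_dir = backup_dir.resolve()
    # Refuse a backup location inside the jobs folder itself,
    # otherwise the backup would be treated as a job directory.
    if backup_dir == jobs_dir or jobs_dir in backup_dir.parents:
        raise ValueError("backup_dir must be outside jobs_dir")

    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for entry in sorted(jobs_dir.iterdir()):
        if entry.is_dir():
            shutil.move(str(entry), str(backup_dir / entry.name))
            moved.append(entry.name)
    return moved


# Example call (both paths are placeholders, adjust to your installation):
# move_job_dirs(Path("/path/to/workflow-repository/jobs"),
#               Path("/path/to/jobs-backup"))
```

Once the server is running again and everything looks fine, the backup folder can be deleted by hand.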

tgoedeke · February 13, 2020, 5:31pm

Hi,

I deleted all the non-running jobs from within the WebPortal.

I am not sure which job it was, or why, but I can see the running jobs from within the Analytics Platform again and can also open old jobs, just as it should be.

Everything seems fine for now.
Thanks for your help!

best regards,
Torsten

system · February 20, 2020, 5:31pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.