Workflows on KNIME Server mysteriously disabled

Hi, I have run into a strange problem recently: some workflows that were scheduled to run on our KNIME Server have been disabled for an unknown reason. I’m sure nobody on my team disabled them manually. Is there anything that can cause a workflow to be disabled by the backend?

Another issue that came up recently: a few workflows on our KNIME Server may generate large (~50 GB or bigger) temp files while running, and those temp files were not deleted automatically. This sometimes leaves the disk with no free space. Is this related to the “workflow disabled” issue?

I would appreciate any ideas or suggestions. Thanks!

best,
Xiaoan

Hi Xiaoan,

Are the schedules missing or only the executed jobs/protocols? Or are the schedules still there with “Schedule Job” deactivated?

When did this occur? Was it after a recent server update?

For the other issue, please send me your server log files so I can analyze them.

I will send you a PM with the instructions on providing me the logs.

Best,

Michael

Hi Michael,
Thanks for the reply. The issue is that the schedule has been disabled: the schedule is still there but does not execute.

I’ve found the cause of this issue. One of my scheduled jobs contained a bug that created duplicate rows. The duplicates multiplied into a huge table and made the downstream “Joiner” node execute very slowly until memory ran out. When that happened, the server created a .mdmp dump file (some reached 10 GB or more). Once the disk was filled with those dump files, no scheduled job could execute because of low disk space, and then the scheduled jobs were somehow disabled automatically (I believe this is an internal protection mechanism of KNIME Server).
After I corrected the bug, everything is working again.
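
To illustrate why duplicate rows can exhaust memory in a join, here is a minimal Python sketch. It is not KNIME code: `joined_row_count` is a hypothetical helper that only counts how many rows an inner join on a single key column would produce, assuming every matching pair of rows yields one output row.

```python
from collections import Counter


def joined_row_count(left_keys, right_keys):
    """Count the rows an inner join on one key column would produce.

    Each key contributes (occurrences on the left) * (occurrences on the
    right) output rows, so duplicated keys grow the result quadratically.
    """
    left = Counter(left_keys)
    right = Counter(right_keys)
    # Counter returns 0 for keys missing on the right side.
    return sum(count * right[key] for key, count in left.items())


# Unique keys: one output row per key.
print(joined_row_count(["a", "b", "c"], ["a", "b", "c"]))  # 3

# The same key duplicated 1000 times on both sides: 1000 * 1000 + 1 rows.
print(joined_row_count(["a"] * 1000 + ["b"], ["a"] * 1000 + ["b"]))  # 1000001
```

Two tables of about a thousand rows each turn into about a million joined rows, which matches the kind of blow-up described above.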

A few lessons I would like to share:

  1. Keep an eye on the status of scheduled jobs. A job is removed after it runs if “discard job after execution” is checked, so if a job is still listed on the server after execution, it is worth looking into the details (via “open in job”; the error messages contain valuable information but may not reveal the real issue).
  2. Keep an eye on the disk space where KNIME Server is installed. An unusual decrease in free space is a warning sign. Temp files of a job are deleted automatically after successful execution, but if the job cannot finish (it stays running on the server), it builds up large temp files in the “jobs” folder in the KNIME Server folder. If a job fails because of low memory, it may create a large .mdmp file (plus a .txt log file) in the “runtime” folder in the KNIME Server folder. These two folders are where we can find the files that cause low disk space.
  3. Make sure the workflow does not produce duplicate rows; otherwise the duplicates may multiply into a huge table that consumes too much system memory during execution. The “Duplicate Row Filter” node is a good check during development.
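
The disk check from point 2 can be scripted. Below is a minimal Python sketch, not an official KNIME tool. The environment variable `KNIME_SERVER_DIR`, the 10% free-space threshold, and the 1 GB size limit are assumptions made for illustration; the “jobs” and “runtime” folder names are the ones mentioned above and may be located elsewhere in your installation.

```python
import os
import shutil
from pathlib import Path


def free_fraction(path):
    """Return the fraction of free space on the disk that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def large_files(root, min_bytes, suffixes=None):
    """Return (size, path) pairs for files under `root` of at least
    `min_bytes` bytes, largest first. `suffixes` optionally restricts
    the search to extensions such as {".mdmp"}."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            if suffixes and full.suffix.lower() not in suffixes:
                continue
            try:
                size = full.stat().st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size >= min_bytes:
                found.append((size, str(full)))
    return sorted(found, reverse=True)


# Hypothetical setup: KNIME_SERVER_DIR is an assumed variable name, and
# the current directory is used when it is not set.
server_dir = Path(os.environ.get("KNIME_SERVER_DIR", "."))
one_gb = 1024 ** 3

if free_fraction(server_dir) < 0.10:  # assumed threshold: under 10% free
    print("Warning: low disk space on", server_dir)

# Dump files in "runtime" and any large leftover temp files in "jobs".
for size, path in large_files(server_dir / "runtime", one_gb, {".mdmp"}):
    print(f"{size / one_gb:.1f} GB dump file: {path}")
for size, path in large_files(server_dir / "jobs", one_gb):
    print(f"{size / one_gb:.1f} GB job file: {path}")
```

Running something like this on a schedule (for example from cron or Windows Task Scheduler) would give an early warning before the disk fills up. The script only reports files and deletes nothing, so cleanup remains a manual decision.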