Windows Server not clearing Temp folders

We have found that Windows Server isn’t clearing the Temp folders for all the KNIME jobs. Has anyone else had this issue, and if so, how can we resolve this issue?

Hello @Willem ,

to help others in the Community, I will give a longer answer; perhaps parts of it will also help people with similar questions:

The temp folder is where the executor keeps jobs and temporary files while processing them.
Normally temporary files should be deleted when the workflow job is swapped from the executor back to the KNIME Server workflow repository.
This is also where I would start: prevent this swapping from failing. For this purpose there is a timeout in the knime-server.config file, which is easiest to adjust via the WebPortal with a KNIME Server administrator account under Administration → Configuration:

com.knime.server.job.default_swap_timeout=<duration with unit, e.g. 60m, 36h, or 2d> [RT] (See the documentation.)

Specifies how long to wait for a job to be swapped to disk. If the job is not swapped within the timeout, the operation is canceled. The default is 1m. This timeout is only applied if no explicit timeout has been passed with the call (e.g. during server shutdown).

Please check how this timeout is configured on your system. I would configure it so that the swap can always complete, setting the value to 15m (15 minutes), 30m, or 60m according to your needs. Otherwise the job is kept in the executor's RAM/Java heap space and the associated temporary files are not deleted.

You can also decrease the
com.knime.server.job.max_lifetime=<duration with unit, e.g. 60m, 36h, or 2d> [RT]

Specifies the time of inactivity before a job gets discarded (defaults to `7d`).
Negative numbers disable forced auto-discard.

to 1-2d, because without changing this setting a job can stay on disk for up to 7 days after it has run, successfully or unsuccessfully (7d is the default if not set). If you do not need jobs after they have succeeded or failed, you can set this to 1d, and jobs “should” then be discarded, and thus removed from disk, after at most 1 day.
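Taken together, a knime-server.config fragment implementing these suggestions might look like this (the values are illustrative; adjust them to your load and retention needs):

```
com.knime.server.job.default_swap_timeout=30m
com.knime.server.job.max_lifetime=1d
```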

To create space on the hard disk

  • you can delete older files and folders in the temp path; this directory should only contain files that are current.
    (You can set up a cron job, Windows scheduled task, or even a KNIME workflow to delete any files older than your job.max_lifetime setting, as this prevents orphaned jobs from being left in the temp directory. That way, even if KNIME does not remove those old files, you have an automated backup mechanism that deletes them and keeps the disk from filling up.)

  • make sure your users check off “discard workflow” after success/failure, so that jobs do not stay in memory or sit “inactive” on disk if they are not needed after the run.
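As a sketch of the scheduled cleanup mentioned above (the temp path and age threshold are assumptions; point them at your executor's temp directory and at least your job.max_lifetime value):

```python
import os
import time
from pathlib import Path

# Assumed values: set TEMP_DIR to your executor's temp directory and
# MAX_AGE_DAYS to (at least) your com.knime.server.job.max_lifetime setting.
TEMP_DIR = Path(r"C:\Temp\knime")
MAX_AGE_DAYS = 7

def clean_old_entries(temp_dir: Path, max_age_days: int) -> list:
    """Delete files and empty dirs whose last modification is older than max_age_days."""
    cutoff = time.time() - max_age_days * 86400
    removed = []
    # Walk bottom-up so directories are emptied before we try to remove them.
    for root, dirs, files in os.walk(temp_dir, topdown=False):
        for name in files:
            p = Path(root) / name
            if p.stat().st_mtime < cutoff:
                p.unlink()
                removed.append(p)
        for name in dirs:
            d = Path(root) / name
            # Only remove directories that are old AND already empty.
            if d.stat().st_mtime < cutoff and not any(d.iterdir()):
                d.rmdir()
                removed.append(d)
    return removed

if __name__ == "__main__":
    for path in clean_old_entries(TEMP_DIR, MAX_AGE_DAYS):
        print("removed", path)
```

Run it from a Windows scheduled task (or cron on Linux) on whatever cadence suits you; before scheduling, do a dry run with the deletions commented out to confirm only KNIME temp data is matched.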

If you want to configure another directory (bigger volume / more disk space) for the executor's temporary files, you can add a line to the end of knime.ini in the installation directory pointing to the new directory (which must already exist).
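The relevant line is the standard JVM temp-directory property; a sketch with a placeholder path (substitute your own existing directory):

```
-Djava.io.tmpdir=D:\knime-temp
```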

Please give us feedback if this helped, or feel free to contact us again if you have any additional questions!

Have a nice day!


Thanks for the info, but it seems the problem is KNIME isn’t deleting the folders and files in the Temp folder. Even though these settings are set, there are files and folders that are months old.

I have created a workflow to delete files and folders, but unfortunately, even though the setting is set for Executors to access local files, we have one that simply states it isn’t allowed to access local files.

Hi @dora_gcs,

Thanks a lot for your explanation! It really helped me to understand how things work behind the scenes. I am sure other users will benefit from your explanation in the future.

So, basically, I understand that if jobs are not automatically cleared from the tmp folder, it is likely because the server didn’t manage to swap them within the specified default swap timeout.

This actually brings up 2 questions:

  • What could be a reason for the swapping to fail (even if we set it to 30m, for example)? Is it because the executor is swapping other jobs, busy with other job executions, etc.?

  • What happens if, for example, an executor crashes while it has 10 jobs executing / in memory? In this case, I would assume the jobs are not swapped?


Hello @misterhd ,

There could be many reasons why swapping fails: high CPU load, network failure, or garbage collection on either side of the communication (KNIME Server, KNIME Executor).

If an Executor crashes, all the jobs kept in its memory (the ones that are executing when the crash occurs, or the already executed ones that are still in the Executor’s memory) will be lost (“VANISHED”).
On the other hand, when an Executor is gracefully shut down, all jobs currently in memory are swapped back to disk. (They aren’t lost.) Please see more info about the execution lifecycle here.

Hope it helps!

This is not answering my question. We have temp files that would date back months. KNIME Server/executors are simply not deleting the temp folders and files in Windows OS. We have to manually go and delete the Temp files and folders. We did a deletion a while ago and there were folders and files older than 6 months. We cleared over 60GB of temp files and folders, and they were all KNIME related.

Hello @Willem ,

Thank you for your feedback!

It should delete temp files in most cases. Exceptions are files explicitly created there by the user (e.g. within a Python script) and jobs loaded in an executor that are not cleanly swapped back to the server (to prevent data loss). Therefore, we generally suggest to
  • increase the swap timeout
  • clean up any old files (the swap timeout should help avoid new files accumulating, but will not clear out old files).
Should the issue (new files accumulating and not being deleted after the job is swapped) persist, this is an indication that the executor may be under high load or writing to a slow disk (HDD). While it is best to investigate and resolve bottlenecks (large workflows, slow network or disk, high CPU usage), it may suffice to simply schedule a job/script that regularly cleans all files older than e.g. a week.

You can check which types of files need to be cleaned manually (e.g. are all of them coming from KNIME, or are there files created by Python scripts which won’t be cleaned up by KNIME and need to be taken care of by the user or by the OS?).

If you use a separate temp dir for KNIME, you can be sure that only files coming from the executor are stored there.
This can be changed via knime.ini

or in the epf profile:
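As a sketch, with placeholder paths (note that the preference key in the .epf example is my assumption of how the KNIME temp-dir preference is stored; verify it against a preferences export from your own installation):

In knime.ini (append at the end):

```
-Djava.io.tmpdir=D:\knime-temp
```

In an .epf profile:

```
/instance/org.knime.workbench.core/knime.tempDir=D:\\knime-temp
```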

I hope it helps you!

