Best practice for deleting old files from KNIME server


#1

What is the best practice for keeping the disk space on the server clean?

I navigated to the following folder:
/applications/knime_server/workflow_repository
and I see that it has three folders that are taking up a lot of disk space:

  • jobs
  • temp
  • trash

I suspect that I could do this:

  • Use the KNIME Server Administration web page to trash the jobs I no longer need.
  • Run a scheduled job on the server to remove old folders in the trash folder.
  • Not sure how/when I can clean out the temp folder.

Thanks,

Carlos -


#2

Hi Carlos,

For the jobs, you can either remove them manually or let the server take care of this at regular intervals. In /applications/knime_server/workflow_repository/config/knime-server.config, you will find the parameter com.knime.server.job.max_lifetime. The default value is 7d, meaning that jobs are discarded after 7 days. If too many jobs accumulate in that time, you can set this to a lower value.

Regarding the trash folder, you can of course delete its content if you’re sure you won’t need it at some later point. You could also do this via a scheduled job. This is fairly easy via the /trash endpoint of the server’s REST API. A GET request to /trash also returns the deletion date of each element, so you can e.g. delete everything older than x months.
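As a rough sketch of the "delete everything older than x months" idea: once the GET response from /trash has been parsed, the old entries can be picked out by their deletion date. The field names "path" and "deletedOn", and the ISO 8601 timestamp format, are assumptions made for illustration only; check the actual response of your server version before relying on them.

```python
from datetime import datetime, timedelta, timezone


def select_expired(items, max_age_days, now=None):
    """Return the trash items whose deletion date is older than max_age_days.

    `items` is assumed to be a list of dicts parsed from the /trash response.
    The "deletedOn" key is a hypothetical field name, not a documented one.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    expired = []
    for item in items:
        # Accept a trailing "Z" as UTC; older Python versions reject it.
        stamp = item["deletedOn"].replace("Z", "+00:00")
        deleted = datetime.fromisoformat(stamp)
        if deleted.tzinfo is None:
            # Treat timestamps without an offset as UTC (assumption).
            deleted = deleted.replace(tzinfo=timezone.utc)
        if deleted < cutoff:
            expired.append(item)
    return expired
```

The returned items are only candidates; how each one is then removed through the REST API depends on the server and is not shown here.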

Cheers,
Roland


#3

Regarding the temp directory: it should be empty when the KNIME Server is stopped. This may not be true if the server was not shut down gracefully. You may want to flush the contents of the temp directory on server restart, or perhaps doing so at each upgrade is sufficient.

In addition to Roland’s great suggestions, it is also possible to remove jobs via the REST API, which gives you some flexibility if only certain jobs should be automatically removed.


#4

Roland,

Thanks for the information on the job setting. For the time being I will be happy to let my jobs get automatically deleted every 7 days.

Yes, a scheduled job to delete old trash files will work. I’ll set one up.

On the subject of the REST API, I was not able to find a trash endpoint. I found one for jobs:
https://my_server:8080/KNIME/rest/v4/jobs
What would be the endpoint for “trash”?

Thanks,

Carlos -


#5

Jon,

Thanks for the information on the temp folder. Yes, I restarted KNIME Server and the temp folder got cleaned up: good to know. But I rarely have to restart KNIME Server, so I think I’ll run a scheduled job to clean out old files in that folder.
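A minimal sketch of such a scheduled cleanup, assuming that anything in the temp folder untouched for several days is safe to remove (that is an assumption, since a running server may still be using some of these files). The function name, the age threshold, and the dry-run default are illustrative choices, not KNIME settings.

```python
import shutil
import time
from pathlib import Path


def remove_old_entries(directory, max_age_days, dry_run=True, now=None):
    """Remove top-level entries in `directory` not modified for max_age_days.

    Returns the list of matching paths. With dry_run=True (the default)
    nothing is deleted, so the selection can be reviewed first.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for entry in Path(directory).iterdir():
        # lstat reads the entry's own timestamp, not a symlink target's.
        if entry.lstat().st_mtime >= cutoff:
            continue
        removed.append(entry)
        if dry_run:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    return removed
```

For example, remove_old_entries("/applications/knime_server/workflow_repository/temp", 7) lists the entries older than a week without deleting them; passing dry_run=False performs the deletion. Note that a folder's modification time only changes when entries directly inside it are added or removed, so a folder can look old even though files deeper inside it changed recently.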

Good to know that I can use the REST endpoint to manipulate jobs: thanks for pointing that out.

Carlos -


#6

Hi Carlos,

The API endpoint for trash is https://my_server:8080/KNIME/rest/v4/trash.

Edit: Forgot to mention that the trash endpoint was added in KNIME Server 4.6.0. It is not available in older versions.
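For a scripted call, the GET request to that endpoint could be prepared roughly as follows. This is only a sketch: HTTP basic authentication and a JSON response are assumptions about the server setup, and build_trash_request is a hypothetical helper name.

```python
import base64
import urllib.request


def build_trash_request(base_url, user, password):
    """Prepare (but do not send) a GET request for the /trash endpoint.

    Basic authentication is an assumption; adapt it to how your server
    authenticates REST clients.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(base_url.rstrip("/") + "/trash", method="GET")
    req.add_header("Authorization", "Basic " + token)
    req.add_header("Accept", "application/json")
    return req
```

The request can then be sent with urllib.request.urlopen(req), using https://my_server:8080/KNIME/rest/v4 as base_url, and the body parsed with json.load.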

Cheers,
Roland