CPU utilization high and jobs failed - AWS - KNIME Server 4.9

Shutting down the executor (which is easiest to do by shutting down the server, for which you can use the dedicated script in the bin folder) will also release the memory occupied by the JVM.

If I stop the executor, then the running workflow will stop…
It's the same as a reboot…

What I want is that, after a workflow finishes successfully, its memory is released for the next workflow.

Indeed, any running workflows would be stopped, so in this way it is similar to rebooting.

There is a parameter in the server configuration file which controls how long a job stays in memory after completion (or while waiting for user interaction on the WebPortal) before being swapped to disk. The parameter is com.knime.server.job.max_time_in_memory and the default value is 60 minutes. You can set it to 1m to trigger the swap after 1 minute. See the “Job swapping” section of the Server Admin Guide for details. Note that this releases the memory internally within the JVM, but the memory will not be returned to the operating system as long as the JVM is running.
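As a concrete sketch, the corresponding entry in the server configuration file would look like this (the 1m value is just an example; pick a timeout that fits your workload):

```
# knime-server.config excerpt (illustrative)
# Swap completed jobs out of executor memory after 1 minute (default: 60m)
com.knime.server.job.max_time_in_memory=1m
```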

Best,
Mischa

Hi Navin,

As long as the job has not been terminated, KNIME may still hold on to some of the tables in memory. This is intended behavior and not necessarily a bad thing since, for instance, you might want to have a look at some intermediate or final results of the workflow using a table view, in which case it is beneficial to have the data in memory still.

KNIME 4.0 introduced a new table caching strategy that attempts to keep some least recently accessed tables in memory. While this strategy will make your average workflow execution much faster, it will use available memory more liberally and has been observed to severely increase the load of garbage collection on some systems / for some workflows. These changes could be responsible for the increased CPU utilization you are observing. For more details, see this forum post.

If you are experiencing problems as a result of this change, rest assured that we are working on taking load off the garbage collector. If you need a quick fix, you can switch back to the less memory-consuming table caching strategy of KNIME 3.7 and earlier by putting the line

-Dknime.table.cache=SMALL

into your knime.ini.

Best,

Marc


com.knime.server.job.max_time_in_memory
Done, but no change.

I tried it, but as you said, the memory will not be released to the system while the JVM is running.

-Dknime.table.cache=SMALL
Done, but no change.

Can anyone help me with where we set the values for

-Xms
-Xmx

Would updating Java be helpful? How do I do it?

I have set
com.knime.server.executor.max_lifetime=6h
Will this help? Will it abort my running job?

Hi Navin,

As said, the JVM will not release the memory back to the system while the executor is up. What we wanted to achieve with com.knime.server.job.max_time_in_memory was to release memory within the JVM upon job completion, so that the JVM can re-allocate it to another job on the same executor.

com.knime.server.executor.max_lifetime lets you configure the executor lifetime. More specifically, the docs say: "Specifies the time in minutes after which an executor is retired and a new instance is created (defaults to 1d), negative numbers disable." Running jobs do not get killed; the retired executor is kept alive to let activity on it finish. The consequence is that a second, "active" executor is started once the old one stops accepting jobs. The downside for you is that the old executor can still block X GB of memory, and the new active one will not be able to get the memory it needs, since the total is constrained by the machine's resources. So at this stage I do not see it as a helpful solution.
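Since the docs quoted above say negative numbers disable the rotation, switching it off entirely would look something like this in the server configuration file (a sketch; -1 is just one valid negative value):

```
# knime-server.config excerpt (illustrative)
# Negative value disables executor rotation, so no second executor
# competes for memory with the retired one
com.knime.server.executor.max_lifetime=-1
```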

-Xmx is specified in the same file as -Dknime.table.cache, which @marc-bux had recommended, namely in the knime.ini file of the executor. Reducing the maximum memory that the JVM is allowed to allocate is a good solution until you understand what else consumes memory on your machine, or until the machine has more memory :slight_smile:
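To make that concrete, a minimal sketch of the relevant knime.ini lines might look like this (the 2g/8g values are purely illustrative; size them to what your machine can spare):

```
# knime.ini excerpt of the executor (values are examples only)
-Xms2g
-Xmx8g
-Dknime.table.cache=SMALL
```

With -Xmx lowered, the JVM's footprint is capped, leaving headroom for the OS and any other processes on the instance.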

I do not see how updating Java would help; could you elaborate on the reasoning?

Cheers,
Mischa