Knime Server down (inaccessible with Proxy Error)

Hi all,

Yesterday, our Knime Server (version 4.9.1) went down. We could not access the WebPortal at all (we only got a Proxy Error).
We had to restart the server to get it working again.

At that moment, we saw that CPU usage was high (we have a WF that consumes a lot of CPU and RAM). However, we also found this in the server log:

"https-jsse-nio2-8443-exec-11" #1758 daemon prio=5 os_prio=0 tid=0x00007f22e4008800 nid=0x3e53 waiting for monitor entry [0x00007f229e4be000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.knime.enterprise.server.application.UserState.writeState(UserState.java:154)

Could this indicate a particular problem? (For more details, I have attached the complete server logs: catalina.log (183.6 KB).)

Has anyone else had this kind of problem?

I hope to learn more about this so that we can avoid it in the future.

Thanks in advance,

Thanh Thanh

Hi Thanh,

Thank you for your question regarding this issue. I did a little research on the “java.lang.Thread.State: BLOCKED” message you are seeing in the logs. That state means the thread is waiting to acquire a lock (monitor) that another thread currently holds. In your case it seems to be related to a memory consumption issue or even a garbage collection issue, where a large amount of memory is consumed at once and things are put on hold while garbage collection is happening.
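To illustrate what that thread state means in general, here is a minimal sketch that reproduces BLOCKED on purpose. It is not taken from Knime Server: the class name, lock object, and method name are hypothetical, and it only shows plain monitor contention, not whatever is happening inside UserState.writeState.

```java
class BlockedStateDemo {
    // Hypothetical lock object; it stands in for whatever monitor the real code contends on.
    private static final Object LOCK = new Object();

    /** Holds LOCK, starts a second thread that wants the same lock, and returns that thread's state. */
    static Thread.State observeContendedState() {
        Thread waiter = new Thread(() -> {
            synchronized (LOCK) {
                // Nothing to do here; the thread only needs to contend for the monitor.
            }
        }, "waiter");

        Thread.State observed;
        try {
            synchronized (LOCK) {
                waiter.start();
                // Poll (for at most 5 seconds) until the waiter is parked on the monitor.
                long deadline = System.nanoTime() + 5_000_000_000L;
                while (waiter.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
                    Thread.sleep(10);
                }
                observed = waiter.getState();
            }
            // The lock is released now, so the waiter can finish.
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while observing the waiter thread", e);
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("State while the lock is held: " + observeContendedState());
    }
}
```

If many request threads show this state in a dump, the useful follow-up question is which thread is holding the lock and why it is slow to release it (a long garbage collection pause would be one possible reason).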

I have not seen this issue before but if you could help us figure out a few things, perhaps we can get to the root cause:

  1. Can you send us your knime.ini file for us to investigate what your memory settings currently are?
     • This can be found in the Knime Executor location
  2. How much total memory does the VM you are running Knime Server on have?
  3. Could you send us all files in the apache-tomee/logs folder?
  4. Can you give us a corresponding time stamp of the last time this happened so we can match it up to the files in the logs folder?
  5. Is this the first time you have experienced this issue? If it is the first time you have experienced this issue, has anything changed on the server side or within the WF that usually takes up a large amount of memory? I would like to try and figure out if the WF was almost at the threshold of taking up all available memory, and maybe a few more things have been added, causing the WF to crash the server.
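If it helps while collecting the logs, a rough way to see how widespread the blocking is would be to count the thread states in a dump. The sketch below is only an assumption about how one might do that: the class and method names are made up, and it simply looks for the "java.lang.Thread.State: ..." lines that a JVM thread dump prints.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ThreadStateCounter {
    // Matches the state name that follows "java.lang.Thread.State: " in a thread dump.
    private static final Pattern STATE = Pattern.compile("java\\.lang\\.Thread\\.State: (\\w+)");

    /** Returns how often each thread state (BLOCKED, RUNNABLE, ...) appears in the given text. */
    static Map<String, Integer> countStates(String dumpText) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = STATE.matcher(dumpText);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // With a path argument, read that file (Java 11+); otherwise use the excerpt from this thread.
        String text = args.length > 0
                ? Files.readString(Path.of(args[0]))
                : "java.lang.Thread.State: BLOCKED (on object monitor)\n"
                  + "at com.knime.enterprise.server.application.UserState.writeState(UserState.java:154)\n";
        System.out.println(countStates(text));
    }
}
```

A single BLOCKED thread is normal in a busy server; a large share of the request threads blocked on the same method at the time of the outage would be more telling.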

Once we get a bit more info, hopefully we can give you a definitive root cause.

Thanks,
Zack

Hi Zack @ztrubow

First of all, thanks for your attention to our problem :slight_smile:
I’ve put all the requested files (knime.ini and logs) here.

Our machine has a total of 64 GB of RAM, and the last time this happened was on 07 September, around 17:30.
This is the second time we have had this kind of server crash (the other one was on 02 September).

I hope this information helps you with the investigation :slight_smile:

Thanks very much!

Thanh Thanh

Hi Thanh,

Depending on how you start Knime Server on your instance, we have two options that may be able to help with this issue:

  1. If you are starting Knime via systemctl on startup, you will want to edit the knime-server.service file and increase the memory for CATALINA by setting Environment="CATALINA_OPTS=-Xmx8192M -server", then adjust your knime.ini file and change the memory setting to 48g, like so: -Xmx48g.

  2. If you are using the startup.sh script to start Knime Server, you will want to go into the apache-tomee/bin/ folder, edit the setenv.sh script, and change the -Xmx setting to 8192M, similar to the instructions above. You will then also want to change the -Xmx setting in the knime.ini file to 48g, as above.

What this will effectively do is give more memory to the apache-tomee service (webportal) so that it shouldn’t fail.
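As a sanity check on these numbers, the two heap limits should leave some room on a 64 GB machine for the operating system and for the memory a JVM uses outside its heap. Here is a minimal sketch of that arithmetic; the class and method names are hypothetical, and the parser only understands the k/m/g suffixes used in this thread.

```java
class HeapBudget {
    /** Parses a JVM-style size such as "8192M" or "48g" into MiB. A suffix (k, m or g) is required. */
    static long parseSizeToMiB(String size) {
        String s = size.trim().toLowerCase();
        char unit = s.charAt(s.length() - 1);
        long value = Long.parseLong(s.substring(0, s.length() - 1));
        switch (unit) {
            case 'g': return value * 1024;
            case 'm': return value;
            case 'k': return value / 1024;
            default: throw new IllegalArgumentException("Unsupported size suffix in: " + size);
        }
    }

    /** Returns the RAM (in MiB) left over after reserving every given -Xmx value. */
    static long headroomMiB(long totalRamMiB, String... xmxValues) {
        long reserved = 0;
        for (String xmx : xmxValues) {
            reserved += parseSizeToMiB(xmx);
        }
        return totalRamMiB - reserved;
    }

    public static void main(String[] args) {
        long totalRamMiB = 64L * 1024; // the 64 GB machine mentioned in this thread
        long headroom = headroomMiB(totalRamMiB, "8192M", "48g"); // TomEE heap + executor heap
        System.out.println("Left for the OS and non-heap memory: " + headroom + " MiB");
        // Reports the -Xmx that is actually in effect for the JVM running this class.
        System.out.println("Max heap of this JVM: "
                + Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MiB");
    }
}
```

With 8192M for TomEE and 48g for the executor, 56 GB of the 64 GB is reserved for the two heaps, which leaves 8 GB. Keep in mind that each JVM also needs memory beyond its heap (thread stacks, metaspace, native buffers), so the real footprint will be somewhat higher than the two -Xmx values combined.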

Please let me know if you have any questions.

Thanks,
Zack

Hi Zack,

Thanks for your reply. I’ll change the CATALINA_OPTS (knime.ini is already at 48g). I hope this makes our server more robust with the heavy WF we have.

I’ll keep you informed …

Thanks and have a nice day,

Thanh Thanh