Garbage Collector // memory overflow

Dear all,

Since I started working with the 2.x version of KNIME, some nodes run into a memory overflow because the garbage collector does not start by itself. If I click the trashcan icon, the memory usage goes back down to normal but then starts rising again. I observed this problem with the substructure search of the CDK nodes (the node is set to write tables to disc) when searching in ~1 million molecules. I also ran into it when I tried to concatenate several big SDF files with the concatenate node. In the 1.3 version of KNIME this was/is not a problem.
The Java version I use is:
java version "1.6.0_0"
OpenJDK Runtime Environment (build 1.6.0_0-b11)
OpenJDK Server VM (build 1.6.0_0-b11, mixed mode)

My knime.ini contains this:
-clean
-vmargs
-Xms1024m
-Xmx1024m
-XX:MaxPermSize=254m
-server

Maybe somebody has an idea what's wrong here.

best regards

Christian

Hi Christian,

This is interesting. Three comments, two questions.

Comments:

  • The garbage collector probably kicks in only when memory gets low. So if you have used 700MB out of 1024MB, there is no reason for the VM to do a full garbage collection.
  • The write to disc option only controls the table writing. The framework has no control over the individual node implementation (so if any of the CDK classes consume a lot of memory, e.g. by creating gigantic arrays, the framework can't do anything about it).
  • OpenJDK is not supported by us.
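
To make the first point concrete, here is a minimal Java sketch of how heap usage can be watched from inside a JVM. It is not KNIME code; the class and method names are made up for illustration, and it rests on the assumption that the trashcan icon does something comparable to the System.gc() request shown here. The numbers it prints depend on the VM and its settings.

```java
public class HeapUsageSketch {

    /** Bytes currently occupied on the heap, including garbage that has not been collected yet. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Upper limit the heap may grow to (what -Xmx controls). */
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        long before = usedBytes();

        // Create short-lived garbage; nothing keeps a reference to these arrays.
        for (int i = 0; i < 1000; i++) {
            byte[] junk = new byte[64 * 1024];
            junk[0] = 1;
        }
        long afterAlloc = usedBytes();

        // Only a request: the VM is free to ignore it or to collect later.
        System.gc();
        long afterGc = usedBytes();

        System.out.printf("max heap:           %d MB%n", maxBytes() / (1024 * 1024));
        System.out.printf("used before:        %d KB%n", before / 1024);
        System.out.printf("used after alloc:   %d KB%n", afterAlloc / 1024);
        System.out.printf("used after gc call: %d KB%n", afterGc / 1024);
    }
}
```

A "used" value that climbs towards the maximum is therefore not a leak by itself; it only becomes a problem if it stays high after a collection.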

Questions:

  • Does your 1.3 version use the same VM?
  • Does it work (in 2.0.x) with a different VM (the default, i.e. Sun's JRE)?

Regards

Bernd

Hi Bernd,

Thank you for the reply.

The garbage collector probably kicks in only when memory gets low. So if you have used 700MB out of 1024MB there is no reason for the VM to do a full garbage collection

Exactly this happens (at least with the substructure search): once the process has eaten up 1024 MB, the garbage collector kicks in. On one occasion I observed that the process stalled, and I remembered that I had the same problem with the concatenation. The other substructure searches I carried out have been stable so far.

Questions:

  • Does your 1.3 version use the same VM?
    Yes.
  • Does it work (in 2.0.x) with a different VM (the default, i.e. Sun’s JRE)?

I observed the concatenation issue with the brand-new 2.0.0 version of KNIME when I switched from the 1.3 version. I will try this again with the newest version (2.0.3) to see if the problem still exists.

best regards

Christian

Hi Bernd,

It seems that the reason for the crashes is that, via ‘run all executable nodes’, I started the processing of up to 4 files at once while the others remained in the queue. This seems to be too much for the engine.
Is there a way to change this behavior so that just one node is executed while the others remain in the queue?

best regards

Christian

There is an option in the preferences (“File” -> “Preferences” -> “KNIME” -> “Maximum working threads for all nodes”) which controls the number of nodes being executed in parallel (there is some more math behind it if nodes implement the specific parallelization API, but that’s a good rule of thumb).

Let us know if that works.
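
As a rough illustration of what such a thread limit does, here is a small Java sketch using a fixed-size thread pool from java.util.concurrent. It is not how KNIME implements the setting; the class name, the method peakConcurrency and the sleeping dummy tasks are assumptions made purely for the example. With a limit of 1, all submitted tasks still run, but one after another, while the rest wait in the queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class WorkerLimitSketch {

    /**
     * Submits taskCount dummy tasks to a pool with maxThreads workers and
     * returns the highest number of tasks that were running at the same time.
     */
    static int peakConcurrency(int maxThreads, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < taskCount; i++) {
                futures.add(pool.submit(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(20); // stand-in for a node doing real work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        running.decrementAndGet();
                    }
                }));
            }
            // Wait until every queued task has finished.
            for (Future<?> f : futures) {
                f.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("limit 1, 4 tasks -> peak " + peakConcurrency(1, 4));
        System.out.println("limit 4, 4 tasks -> peak " + peakConcurrency(4, 4));
    }
}
```

The trade-off is the usual one: a lower limit reduces the peak memory needed at any moment, at the cost of a longer total run time.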

There is an option in the preference (“File” -> “Preferences” -> “KNIME” -> “Maximum working threads for all nodes”)

I cannot find this option in my installation. I pulled a new version of KNIME directly from the server, but I cannot find ‘Maximum working threads for all nodes’ there either. Is there a trick or startup option for an expert mode that gives you more options to play around with?

regards

Christian

I have found the option now. It's directly under the KNIME entry; I was looking for a branch. I set this option to 1 and it seems to work.

Thanks a lot for the help.

Christian

Calling GC.Collect is discouraged because it decreases the performance of the application. Whenever you ask the garbage collector to perform a collection, it suspends all currently executing threads. This can become a performance issue if you call GC.Collect more often than necessary. You should be careful not to place code that calls GC.Collect at a point in your program where users could trigger it frequently. However, if you can reliably test your code to confirm that calling Collect() won't have a negative impact, then go ahead. More about......Garbage Collection

Mark