Hello there,
As an alternative to using a node that forces the GC to run or invoking the GC externally, you can also go to File -> Preferences -> General and check the Show heap status box. You should then see a heap status panel at the bottom right of your KNIME Analytics Platform. Clicking the trashcan icon in that panel performs a full GC sweep.
Regarding the reported heap space exhaustion / memory leak, I can give some background:
KNIME Analytics Platform 4.0 makes a lot more use of its assigned memory for caching recently accessed data in memory. However, data cached this way are only softly or weakly referenced, i.e., they will be made available for garbage collection well before memory becomes critical.
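To illustrate what "softly referenced" means in practice, here is a minimal Java sketch of a cache that holds its values through `SoftReference`. It is not KNIME's actual table cache; the class name `SoftCacheSketch` and the `loader` callback are hypothetical and exist only for this example. The point is that the JVM is free to clear such entries when memory gets tight, and the cache then simply reloads the value.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical illustration only; not KNIME's actual table cache. */
class SoftCacheSketch<K, V> {
    // Values are reachable only through soft references, so the garbage
    // collector may clear them before throwing an OutOfMemoryError.
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    /** Returns the cached value, or reloads it if it is missing or was cleared. */
    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Either never cached or already collected: load again (e.g. from disk).
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCacheSketch<String, int[]> cache = new SoftCacheSketch<>();
        int[] table = cache.get("table-1", k -> new int[] {1, 2, 3});
        // While 'table' is still strongly referenced here, the entry cannot be cleared.
        System.out.println(table == cache.get("table-1", k -> new int[0]));
    }
}
```

A `WeakReference` works the same way in code, but is cleared at the next collection once no strong reference remains, whereas soft references are typically kept until memory pressure rises.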
@DiaAzul, it sounds like what you are experiencing is genuine heap space exhaustion. KNIME Analytics Platform becomes less and less responsive because more and more time is spent unsuccessfully attempting to collect garbage and less and less time is spent doing actual work. Eventually, the Analytics Platform runs out of memory. What I can say is that this should not be related to in-memory table caching, for the reasons outlined above. What I can also say is that the Joiner node can use a lot of memory, but it will flush intermediate data to disk once a memory threshold is reached.
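The "more time collecting garbage, less time working" effect can be quantified with the standard `java.lang.management` beans. The sketch below is only an illustration: it measures the JVM it runs in, not a running KNIME instance (for that you would attach a tool such as VisualVM), and the class name `GcOverheadSketch` is made up for this example. It reports a lifetime average, so a recent spike will be diluted.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper; reports GC overhead of the JVM it is running in. */
class GcOverheadSketch {
    /** Total milliseconds spent in garbage collection since JVM start. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Accumulated GC time divided by JVM uptime (a lifetime average). */
    static double gcTimeFraction() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptime <= 0 ? 0.0 : (double) totalGcTimeMillis() / uptime;
    }

    public static void main(String[] args) {
        System.out.printf("GC time: %d ms (%.1f%% of uptime)%n",
                totalGcTimeMillis(), 100 * gcTimeFraction());
    }
}
```

A fraction that keeps climbing while the heap stays nearly full is the pattern described above.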
Since @qqilihq brought up this forum post that discusses a potential memory leak in the Java Snippet node, I attempted to reproduce the issue on my machine. I did not find a memory leak in the Java Snippet node, however. What I did find is that memory is held by log messages in the Console View if the Console View Log Level is set to INFO or lower. See my latest reply in that other forum post for more details.
Some suggestions you could try:
- You can try setting the Console View Log Level to WARN in File -> Preferences -> KNIME -> KNIME GUI and see if that helps.
- You can configure the memory-intensive nodes (I’m looking at you, Joiner node) to Write tables to disk in the Memory policy tab.
- From your knime.ini, you could remove the line -XX:+UseG1GC and insert the lines -Dknime.table.cache=SMALL, -Dorg.knime.container.cellsinmemory=100000, -Dknime.synchronous.io=true, and -Dknime.compress.io=GZIP. Your KNIME Analytics Platform 4.0.1 will then behave a lot more like your KNIME Analytics Platform 3.7.2. Consequently, it will be a lot slower, yet also a bit less liberal in terms of resource consumption.
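For the knime.ini suggestion, the four options each go on their own line. The fragment below shows only the lines to add; it assumes the usual Eclipse-style layout in which JVM options follow the -vmargs line, and it leaves the rest of your file (including your memory settings) untouched. The -XX:+UseG1GC line is simply deleted.

```
-Dknime.table.cache=SMALL
-Dorg.knime.container.cellsinmemory=100000
-Dknime.synchronous.io=true
-Dknime.compress.io=GZIP
```

KNIME Analytics Platform needs to be restarted for changes to knime.ini to take effect.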
If none of these steps help, could you provide me with a minimal workflow with which I can reproduce the issue? Alternatively, you can use a tool such as VisualVM to generate heap dumps and compare them with one another. This way, you can pin down what exactly is clogging up your memory.