Memory leak?!

WolfiG · February 5, 2018, 6:05am

I have been toying around with Knime a bit and ran into the following problem with the attached workflow (which might not be the best approach to my problem, but it's the best I could come up with since I couldn't get any help with my questions in the forum).

I have a DB table with 10 million records. What I'm trying to do is group the records and then iterate through each row in order to modify some of the attributes based on values in previous rows. First of all, the algorithm is sluggish: processing the first 1 million records took more than 8 hours. When I checked the memory usage after those 8 hours, it showed that Knime was using 11 GB, although I'm only loading a chunk of the data into memory at a time. After restarting Knime it was down to 1.6 GB (which I find is a lot as well). Anyhow, once I started the algorithm again, the memory usage slowly began to climb again. Is this a sign of a memory leak? In any case, neither the poor performance nor the excessive memory usage seems justified for such a relatively simple workflow.

I rewrote the algorithm 1:1 in Python using Jupyter Notebook on the same computer I'm running Knime on, and I was able to process all 10 million records within 40 minutes (also not something to write home about, but quite a difference anyway).

Cheers

Wolfgang

createAllGanttCharts_2.zip

RolandBurger · February 8, 2018, 12:24pm

Hi Wolfgang,

One way to speed up your workflow is to use a Parallel Chunk Loop rather than the "standard" Chunk Loop.

Regarding the memory leak: Where did you check the memory usage? The memory usage shown by your OS is actually not a good indicator. Once Java has requested memory from the operating system, it generally won't give it back. Therefore, seeing the memory usage reported by the OS increase steadily is normal behavior. Even if no workflows are running, the process still holds on to that memory.

With Java, all that matters is the Java heap usage, which can go up and down. There is only a problem if the Java heap usage doesn't go down any more.

If you suspect that this is happening, a first step in investigating it should be to monitor heap usage in jvisualvm. Could you please run it for a while and check whether the heap space gets freed up again or stays occupied?
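
To illustrate the difference between heap that is actually in use and memory that Java has merely reserved from the OS, here is a minimal sketch using the standard java.lang.management API. It only reports on the JVM it runs in, so it does not measure Knime itself; the class name HeapSnapshot and the 32 MB test allocation are made up for the example. "Used" is roughly what the heap graph in jvisualvm shows, while "committed" is closer to what the OS counts against the process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration only; not part of Knime.
class HeapSnapshot {

    /** Returns {used, committed} heap bytes for the JVM this code runs in. */
    static long[] snapshot() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        return new long[] { heap.getUsed(), heap.getCommitted() };
    }

    static void print(String label, long[] s) {
        long mb = 1024L * 1024L;
        System.out.println(label + ": used=" + (s[0] / mb)
                + " MB, committed=" + (s[1] / mb) + " MB");
    }

    public static void main(String[] args) {
        print("start", snapshot());

        // Allocate about 32 MB so that the used heap visibly grows.
        byte[][] blocks = new byte[32][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1024 * 1024];
        }
        print("after allocating", snapshot());

        // Drop the references and ask for a collection.
        // System.gc() is only a hint; the JVM may ignore it.
        blocks = null;
        System.gc();
        print("after gc hint", snapshot());
    }
}
```

If the used value drops again after a collection while the committed value stays high, the memory was reclaimed inside Java even though the OS still attributes it to the process. A used value that keeps climbing across many collections is the pattern worth investigating.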

Cheers,

Roland