Here's the scenario: I have a workflow for social media text analysis - there are several different workflows for removing words, calculating frequencies, ngrams, etc. When I have a smaller data set (say 50K - 100K tweets), I can let it run alone, start to finish, without problems. But with a much bigger data set, like 500K tweets, some nodes cause problems for KNIME - it starts to freeze, becomes unresponsive, looks like it might be deadlocked, and I've also gotten a message about the heap limit being exceeded. It forces me to close and restart KNIME. My workaround is: I run the workflow up to the node just before where the problems start, save the data, close KNIME, reopen it, and execute the rest of the workflow. I might need to do this 2 times depending on the data set.
Here's what appears to happen (not sure if this makes sense, as I'm no expert in this area...): when the above scenario occurs, I usually open the task manager and memory usage is at 99% (6GB+ in use). After closing KNIME the memory looks "free" again, so reopening KNIME and executing the rest of the workflow works. It seems like memory keeps accumulating while the workflow is executed.
As I said, no expert here so this might not make sense, but glad if someone can help.
How much "heap" have you assigned to KNIME? This is the -Xmx value you find in the knime.ini. And when you say you are looking at the task manager and then find the process using 99% of the (system) memory... does that mean the assigned memory is larger than the available memory? Maybe the system is using swap space (so your hard disc) and then obviously things become slow.
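For reference, the heap limit Bernd mentions is a single line in knime.ini. On an 8 GB machine, something in the range of 4 GB is a common starting point, leaving headroom for the operating system and other applications (the value below is illustrative only, not a recommendation for every setup):

```
-Xmx4096m
```

After editing knime.ini you need to restart KNIME for the new value to take effect.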
If that's not the problem: Will it be possible for you to share the workflow with us so that we can diagnose a bit?
Thanks Bernd! See the workflow attached. I removed the data, but I believe you should still be able to spot any potential problems in it.
I have no idea how system memory is assigned to KNIME; I don't understand that part! :)
I have 8GB of RAM and an i5. Not much, but I can work pretty well with KNIME, and I won't be dealing with any data set much bigger than the scenario reported above. So if you have any tips about what I could do to improve performance, I'd appreciate it. Thank you!
Below you can find my knime.ini.
I have the same issue as @gustavo.velho. I'm forced to restart KNIME in order to free the workflow's memory. I have posted this issue in a separate post. I believe there is a bug in KNIME that prevents the workflow's memory from being freed as it should be.
Apologies for having this slip – our forum did not support notifications about a year ago so I didn’t notice Gustavo’s reply. The workflow is missing the input file (some xls file) – can you provide that?
Alternatively, can I get an alternative workflow?
@malik, just pinging you to ask whether you have a workflow available that we can use to reproduce the issue.