I was a happy KNIME Analytics Platform user for more than a year, running version 3.7 on Windows 7. It did freeze sometimes for a few seconds on some big joins, but that was not an issue.
Now I have a new machine, a newer laptop with the same specs, running KNIME 4.0.0 on Windows 10. For some reason it is a disaster area now. Previously I could do something else while a heavier workflow was running; now KNIME stalls the whole system, currently on the Word2Vec Learner in particular, even with only 10k rows of documents a few words long. While a workflow runs, KNIME itself becomes responsive only for maybe two seconds every few minutes, so it is tricky even to stop the workflow, and, what's worse, all the other programs, like the web browser, also freeze for a few seconds at a time.
Are there any cures to that?
The laptop has 32 GB of RAM, 22 GB given to KNIME, an SSD with lots of space, a Core i7… Every suggestion appreciated.
Hi there @Experimenter,
welcome to the KNIME Community after more than a year!
Well, KNIME 4 uses system resources more liberally than earlier versions did. To revert to the previous resource-usage policy, check this FAQ.
Additionally, if you haven't seen them yet, these two topics can help:
Also, if possible, sharing an example workflow would help with further diagnosis.
Below are the relevant lines from my knime.ini file that may help:
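The actual entries did not make it into the quote above, so here is a minimal sketch of what such a knime.ini might contain, based on the options discussed below (the exact values are mine and will differ per machine; `-Dknime.table.cache=SMALL` is the KNIME 4.0 table-cache property, the rest are standard G1GC flags):

```ini
# Maximum heap given to KNIME (16G-24G works for me on a 32 GB machine)
-Xmx16g
# Use the G1 garbage collector
-XX:+UseG1GC
# Raise the mixed-GC trigger from its 45% default (see discussion below)
-XX:InitiatingHeapOccupancyPercent=70
# Deduplicate identical strings during GC (impact unclear, see below)
-XX:+UseStringDeduplication
# Turn off caching of prior tables (new behavior in KNIME 4.0)
-Dknime.table.cache=SMALL
```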
I have 32 GB of RAM but typically give only 16 GB to KNIME; I have run it with up to 24 GB, but at that point it starts to squeeze the rest of the system. Anything between 16 GB and 24 GB seems OK.
InitiatingHeapOccupancyPercent - The default for this is 45%, but I am running mine at 70%. My observation is that a mixed garbage collection is triggered whenever the old space exceeds the IHOP threshold. When a large number of long-lived objects sit in the old heap space, garbage collection is triggered continually. Raising the threshold to 70% means that more data can be stored on the heap before GC is continually triggered.
Note: Most of my data is ints/doubles, not strings. You may want to reduce the IHOP threshold from 70% if you find you have memory-allocation issues. When saving a table of documents/strings, Java splits the data into an array of pointers to the documents/strings, and each doc/string is itself an array of characters. Depending on the size of a region in the heap, this may mean that the array of pointers is allocated directly to the old gen of the heap, while the character arrays are allocated to the eden/survivor space. If you set IHOP high, you may be able to allocate memory for the array of pointers but not have enough space in eden/survivor to allocate the character arrays for the docs/strings… It all depends on the individual workflow and machine, and on how the application works as a whole: all I am saying is that you should use something like VisualVM to monitor how memory is being allocated, so that you can configure the environment appropriately.
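To make the IHOP arithmetic concrete, here is a small illustrative Python sketch (the heap sizes are just example numbers; the helper name `ihop_trigger_gb` is mine, not a KNIME or JVM API):

```python
def ihop_trigger_gb(max_heap_gb: float, ihop_percent: float) -> float:
    """Approximate old-generation occupancy (in GB) at which G1 starts
    a concurrent marking cycle, per -XX:InitiatingHeapOccupancyPercent."""
    return max_heap_gb * ihop_percent / 100.0

# Default IHOP (45%) on a 16 GB heap: marking starts around 7.2 GB occupied
print(ihop_trigger_gb(16, 45))  # 7.2
# Raised to 70%: roughly 11.2 GB of long-lived data fits before GC kicks in
print(ihop_trigger_gb(16, 70))  # 11.2
```

So the higher threshold buys headroom for long-lived tables, at the cost of less space left over for the eden/survivor allocations described above.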
Cache=small - Turns off caching of prior tables; this had a big impact for me. Table caching was a new feature in KNIME 4.0 meant to improve performance, but in my workflow old tables were not being evicted, so I ran into continuous garbage collection.
UseStringDeduplication - I’ve included this because it is an option in G1GC; however, I am not sure whether it is actually having much impact (for better or worse).
You may also want to look in Preferences to see how many threads you are allowing KNIME to run in parallel. More parallel operations require more memory. I had to scale back the number of simultaneous threads when using H2O.
I hope that helps.
This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.