Hi,
I am fairly new to KNIME and so I’m sorry if my problem looks/is trivial, but I really can’t find a solution.
After successfully working through several of the Analytics EXAMPLES, where I would run the example first and then adapt it to my own data files, I got to one where I am getting an “Execute failed: Java heap space” message at the very last node, a Scorer. (I started with the basic “Learning with a Neural Network” example, and other than adding a Column Filter everything else is the same.)
I am using KNIME version 4.1.3
My laptop has 24GB of RAM and a 4-core CPU, and at setup about 12GB were allocated to the heap space.
I am using a CSV file with 477,000 rows and 14 columns - nothing particularly big, then.
So, to solve the problem, after some searching I did 3 things:
1 - In the nodes, I changed the memory policy to write tables to disk
2 - I added the line -Dknime.table.cache=SMALL to knime.ini
3 - I increased the heap space to about 16GB
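In case it helps others, the knime.ini lines corresponding to steps 2 and 3 look roughly like this (the -Xmx line controls the maximum heap size and usually already exists near the end of the file, so it is edited rather than added; the rest of the file is left as-is):

```
-Xmx16g
-Dknime.table.cache=SMALL
```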
(Along the way I restarted KNIME a couple of times, of course)
I am using a filtered /pre-processed data set for Boston_crime as available publicly - the file is about 33MB in size.
About the examples, I followed this path in the KNIME Explorer:
EXAMPLES -> 04_Analytics -> 04_Classification_and_Predictive_modelling -> 02_Example_for_Learning_a_Neural_Network
I am indeed running other software at the same time, but at least for a one-off experiment I will do as you suggest. I don’t think this is ideal going forward, though. I’ve processed the bigger, non-filtered version of the file in a Jupyter notebook, running Python, with the same NN approach, and it worked without a problem.
OK, I tried with the 20GB value, and even after closing all other apps, KNIME just crashed in the end without a warning. With 19GB I got the same error as before.
So, … no.
Can you share your current version of the workflow? It would be useful to see what your target and features are, along with what you’re filtering. Then I can try to reproduce the problem.
Thank you for sharing the workflow and data with us! I can confirm the Java heap space fills up during the execution of the Scorer node. Your input data contains the class column as Number (Integer) values, which makes the RProp MLP Learner learn a regression model instead of a classification model. As a result, the Prediction column in the output of the MultiLayerPerceptron Predictor contains >100K distinct “classes”, for which the Scorer then tries to compute a confusion matrix and fails…
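You can reproduce the same failure mode outside KNIME. Here is a small scikit-learn sketch (with made-up data, not your Boston crime file): a numeric target silently turns the task into regression, whereas casting the codes to strings makes it classification with a bounded set of predicted labels:

```python
import numpy as np
import pandas as pd
from sklearn.neural_network import MLPClassifier, MLPRegressor

# Toy stand-in for the data: a "class" column holding numeric codes.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "feat1": rng.normal(size=200),
    "feat2": rng.normal(size=200),
    "class": rng.integers(100, 105, size=200),  # codes, not quantities
})
X = df[["feat1", "feat2"]]

# Numeric target -> the library happily fits a *regression* model,
# whose predictions are arbitrary floats (a huge number of distinct values).
reg = MLPRegressor(max_iter=50).fit(X, df["class"])
print("distinct regression outputs:", len(np.unique(reg.predict(X))))

# Cast the codes to strings -> the task becomes classification,
# and predictions are restricted to the known labels.
clf = MLPClassifier(max_iter=300).fit(X, df["class"].astype(str))
print("distinct predicted classes:", len(np.unique(clf.predict(X))))
```

A confusion matrix is quadratic in the number of distinct labels, which is why >100K “classes” exhausts the heap while 5 real classes are trivial.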
Changing the type of your class column (and computing its domain) forces classification mode and stops the Scorer from eating up all the available memory. Here’s how that would look:
You are right - the problem was indeed that the Class column was a number (it is just a code, but a number nonetheless).
So, the whole thing worked - many thanks!
In general terms I am still a bit puzzled by the need for the Domain Calculator, but that is probably because I never saw that “concept” in my previous tools (mostly around the Python ecosystem).
Is this something specific to how KNIME works?
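For readers coming from the Python ecosystem: the closest analogue of a KNIME column “domain” is probably pandas’ Categorical dtype, which stores the set of possible values of a column explicitly, much like the Domain Calculator records possible values for nominal columns and min/max bounds for numeric ones. A quick sketch with made-up data:

```python
import pandas as pd

# Nominal column: the "domain" is the explicit set of possible values,
# which pandas exposes via the Categorical dtype.
s = pd.Series(["A", "B", "A", "C"]).astype("category")
print(list(s.cat.categories))  # ['A', 'B', 'C']

# Numeric column: the "domain" is just the min/max bounds.
n = pd.Series([3, 1, 4, 1, 5])
print(n.min(), n.max())  # 1 5
```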