KNIME Random Forest Learner java.lang out of memory

Running on a Windows 10 laptop with 16GB of memory.

The workflow is very simple: a CSV Reader feeds a partitioning step, and 80% goes to SMOTE and then on to the Random Forest Learner. All data transformation was done before KNIME and saved in the CSV.

The file is 282MB, with 920,000 rows and 50 columns.

My estimate after the 80% partition and SMOTE is 1.5 million rows.
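As a rough check on that estimate, the arithmetic can be sketched in Python. This assumes a stratified 80/20 split, assumes SMOTE oversamples the minority class up to parity with the majority, and uses the approximate class counts given later in the thread (about 920,000 false, 1,300 true). The 8 bytes per cell is also an assumption, giving a lower bound on the raw data size rather than a measurement of what KNIME actually allocates.

```python
# Rough size estimate; all inputs are approximations taken from the thread.
N_FALSE = 920_000      # majority class (approximate)
N_TRUE = 1_300         # minority class (approximate)
N_COLUMNS = 50
TRAIN_FRACTION = 0.8
BYTES_PER_CELL = 8     # assumption: every cell stored as a 64-bit double


def estimate_after_smote(n_false, n_true, train_fraction):
    """Return (train_false, train_true, total_rows_after_smote).

    Assumes a stratified split and SMOTE oversampling the minority
    class until both classes have the same number of rows.
    """
    train_false = round(n_false * train_fraction)
    train_true = round(n_true * train_fraction)
    total = 2 * max(train_false, train_true)
    return train_false, train_true, total


train_false, train_true, total_rows = estimate_after_smote(
    N_FALSE, N_TRUE, TRAIN_FRACTION
)
raw_megabytes = total_rows * N_COLUMNS * BYTES_PER_CELL / 1e6

print(f"training rows before SMOTE: {train_false + train_true:,}")
print(f"rows after SMOTE:           {total_rows:,}")
print(f"raw numeric data:           {raw_megabytes:,.0f} MB")
```

Under these assumptions the result is about 1.47 million rows, in line with the 1.5 million estimate, and several hundred megabytes of raw numeric data before any overhead from the JVM or the learner itself.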

The Random Forest Learner runs up to 10% in 30 seconds and then nothing. 5 minutes later I get the out of memory error, even after telling the node to write to the hard drive instead of caching in memory. According to Task Manager, KNIME's memory use climbs to about 9GB.

I changed the knime.ini file to 12GB and set all the nodes to write to disk, and obviously it’s running a lot slower. SMOTE is looking like it will take an hour to run, and I’m not optimistic the Learner will run.
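For reference, the heap limit in knime.ini is set with the JVM's `-Xmx` option, which sits below the `-vmargs` line. A fragment matching the 12GB setting mentioned here might look like this; the rest of the file is omitted and should be left as it is.

```
-vmargs
-Xmx12g
```

KNIME has to be restarted for the new limit to take effect.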

Any ideas?

Hi @montecarlo

You are running a classification model. I’m asking myself whether you really need so many records. How many records are in each class (0/1 or YES/NO)?
gr. Hans
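One way to act on this question, if the full data set turns out not to be needed, is to undersample the majority class before training instead of oversampling the minority with SMOTE. The following is a minimal Python sketch of that idea, not something proposed verbatim in the thread; the function name `undersample_majority`, the `ratio` parameter, and the toy data are all hypothetical.

```python
import random


def undersample_majority(rows, label_of, ratio=10, seed=42):
    """Keep every minority row and at most `ratio` majority rows per minority row.

    rows     -- list of records
    label_of -- function returning True for the minority (positive) class
    ratio    -- hypothetical knob: majority rows kept per minority row
    """
    minority = [r for r in rows if label_of(r)]
    majority = [r for r in rows if not label_of(r)]
    keep = min(len(majority), ratio * len(minority))
    rng = random.Random(seed)          # fixed seed for a repeatable sample
    sampled = rng.sample(majority, keep)
    result = minority + sampled
    rng.shuffle(result)                # avoid all positives sitting at the front
    return result


# Toy data with roughly the imbalance seen in the thread, scaled down.
data = [{"id": i, "target": i % 700 == 0} for i in range(70_000)]
reduced = undersample_majority(data, lambda r: r["target"], ratio=10)

n_true = sum(r["target"] for r in reduced)
print(len(data), "->", len(reduced), "rows;", n_true, "positives kept")
```

All positives are kept and only the majority class shrinks, so the training table becomes far smaller than the SMOTE output. The trade-off is that discarded majority rows carry information the model never sees, so the ratio is worth tuning against a held-out set that keeps the original class distribution.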


Do the SMOTE in a separate workflow and store the results. Or try without SMOTE, which needs a lot of resources. Or, if you need balancing, try the H2O random forest, which has a balancing option included.


@HansS

About 920,000 falses and 1,300 trues.

This actually worked. It took several hours, though.
