GroupBy node GC overhead limit exceeded

C_C_85 · May 11, 2017, 12:27pm

Sorry for posting another thread about GC overhead, but none of the solutions found through the forum seems to work for me.

I'm experiencing this issue with the GroupBy node when reading a table with more than 77m rows and three columns.

I'm running KNIME on Windows 7 (64-bit) with 4 GB of RAM.

I've already increased the heap size to 3 GB via the -Xmx setting in the knime.ini file, increased the number of cells retained in memory, and tried all the memory policy options, but the node still doesn't work.

Could working with database nodes help?

Any other suggestions?

Thanks in advance

Iris · May 16, 2017, 9:07am

Hi,

77m is quite a number :-).

What is your grouping goal? Which aggregation methods did you use?

Maybe we can use a stepwise approach or a loop approach if you are reducing the data tremendously.

Sure, the database nodes will do the grouping inside the database, so the data never actually gets loaded into KNIME. If you have a running database, this might be a good option.
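
To illustrate the idea of letting the database do the grouping, here is a minimal sketch outside KNIME. It is not the KNIME database nodes themselves: it uses Python's built-in sqlite3 module as a stand-in for whatever database is available, and the table name `t`, the column names `grp` and `val`, and the distinct-count aggregation are all assumptions made for the example.

```python
import sqlite3


def unique_counts(rows):
    """Return {group: number of distinct values}, with the grouping done by SQLite.

    `rows` is an iterable of (group, value) pairs. The table and column
    names below are hypothetical and only serve this example.
    """
    # An in-memory database keeps the sketch self-contained; a real setup
    # would connect to an existing database that already holds the table.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (grp TEXT, val TEXT)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
        # The aggregation runs inside the database engine; only one small
        # result row per group comes back to the client.
        cur = conn.execute(
            "SELECT grp, COUNT(DISTINCT val) FROM t GROUP BY grp ORDER BY grp"
        )
        return dict(cur.fetchall())
    finally:
        conn.close()


print(unique_counts([("a", "x"), ("a", "x"), ("a", "y"), ("b", "z")]))
# prints {'a': 2, 'b': 1}
```

The point of the sketch is that only the aggregated result, one row per group, is returned to the caller, rather than all of the input rows.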

Best, Iris

C_C_85 · May 16, 2017, 10:48am

Hi,

Thanks for the reply! The aggregation method is unique count, and unfortunately I cannot reduce the data.

For the database approach, what are the steps?