Hello
How can I run a GroupBy on 130 million records, grouping on 4 columns?
For the rest of the columns, I want the first value.
Currently, when I do this, the heap fills up and it hangs.
Is it possible to use an index?
Thanks for your guidance
That is a ton of rows… Did you try the “Write tables to disc” option in the Memory Policy tab? How much RAM does KNIME have access to? If your system has additional resources available, you may need to give KNIME more RAM by adjusting your knime.ini file.
That did not solve the problem.
RAM: 64 GB DDR4
HDD: 4 TB
How much of that RAM is available to KNIME in the knime.ini file? Maybe try raising it to 50 GB or more:
-Xmx50g
Hi @mikep2020
Maybe this -GroupBy- problem can be divided into chunks, using the -Chunk Loop- node to split it into several -GroupBy- steps that are less memory hungry.
This way, less memory is needed per iteration, and eventually less too for the second, final grouping over the collected partial results.
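To make the idea concrete, here is a minimal sketch of the two-stage grouping in plain Python rather than KNIME nodes. The function names, the dictionary-per-row layout, and the chunk size are assumptions for illustration only. It assumes the aggregation is "first value per group", which stays correct when the chunks are processed in their original order; other aggregations such as a mean would need sums and counts carried through the first stage instead.

```python
from itertools import islice


def chunks(rows, size):
    """Yield successive lists of at most `size` rows (plays the role of the chunk loop)."""
    it = iter(rows)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def group_first(rows, key_cols):
    """Group rows on key_cols and keep the first row seen for each key."""
    out = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        out.setdefault(key, row)  # later rows with the same key are ignored
    return out


def chunked_group_first(rows, key_cols, chunk_size):
    """Stage 1: group each chunk on its own. Stage 2: group the partial results."""
    partials = []
    for block in chunks(rows, chunk_size):
        partials.extend(group_first(block, key_cols).values())
    return group_first(partials, key_cols)
```

The saving depends on the data: the partial results only stay small if the number of distinct key combinations is much lower than the number of rows.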
Could you please upload here a small example of your initial data and a mock small example of what you need to achieve so that we can help you further?
Hope it helps.
Best
Ael
The problem is solved, thank you dear friends