Hello.
This is not the usual "how do I increase RAM" question. It is much more specific ;) We are loading CSV files with about 20 columns and 3 million rows (11 GB each). Any transformation, such as string manipulation (to lower case), takes ages, and grouping to do calculations takes ages too. My INI is:
-startup
plugins/org.eclipse.equinox.launcher_1.3.200.v20160318-1642.jar
--launcher.library
plugins/org.eclipse.equinox.launcher.win32.win32.x86_64_1.1.400.v20160518-1444
--launcher.defaultAction
openFile
-vm
plugins/org.knime.binary.jre.win32.x86_64_1.8.0.152-01/jre/bin
-vmargs
-Dorg.knime.container.cellsinmemory=6000000
-server
-Dsun.java2d.d3d=false
-Dosgi.classloader.lock=classname
-XX:+UnlockDiagnosticVMOptions
-XX:+UnsyncloadClass
-Dsun.net.client.defaultReadTimeout=0
-XX:CompileCommand=exclude,javax/swing/text/GlyphView,getBreakSpot
-Xmx42G
-Dorg.eclipse.swt.browser.IEVersion=10001
-Dsun.awt.noerasebackground=true
-Dequinox.statechange.timeout=30000
I am running this on a Xeon 1535 v6 at 3.1 GHz with 4 cores, and as you can see in the images below, the machine is not fully utilized. And yet a simple GroupBy (no math) takes 6 minutes, and a string manipulation takes close to an hour. I have set it to keep everything in memory. Still not any better.
What can I do? (I have to run about 60 of these files a day ;)
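For scale, here is a rough back-of-the-envelope sketch in Python using only the numbers above. It assumes that cellsinmemory counts individual cells (rows times columns), that both steps run once per file, and that files are processed one after another; "close to an hour" is rounded to 60 minutes. The function names are just for illustration.

```python
# Numbers taken from the post; the interpretation of them is an assumption.
ROWS = 3_000_000
COLUMNS = 20
CELLS_IN_MEMORY = 6_000_000   # -Dorg.knime.container.cellsinmemory
FILES_PER_DAY = 60
GROUP_BY_MIN = 6              # simple GroupBy, no math
STRING_MANIP_MIN = 60         # "close to an hour", rounded


def cells_per_table(rows, columns):
    """Total number of cells in one loaded table."""
    return rows * columns


def daily_hours(files, minutes_per_file):
    """Hours needed per day if files are processed sequentially."""
    return files * minutes_per_file / 60


cells = cells_per_table(ROWS, COLUMNS)
hours = daily_hours(FILES_PER_DAY, GROUP_BY_MIN + STRING_MANIP_MIN)

print(f"cells per table: {cells:,} "
      f"({cells / CELLS_IN_MEMORY:.0f}x the configured cellsinmemory value)")
print(f"sequential processing time per day: {hours:.0f} hours")
```

If those assumptions hold, one table has more cells than the configured limit, and the daily workload does not fit into 24 hours at the current speed.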
Thank you