Hi guys,
I use the Parallel Chunk Start and Parallel Chunk End nodes to parallelize some parts of my workflows. In general these nodes work well, and it's always a pleasure to see all the worker threads of your machine burning!
Nevertheless, if I use the “Automatic chunk count” option in the Parallel Chunk Start node, I always get 3 more chunks than the number of threads available on my machine. I also checked that the “number of working threads for all nodes” is set correctly under Preferences → KNIME.
Do you know what causes this behavior?
"Automatic" is slightly overestimating. It's
(int)Math.ceil(1.5 * Runtime.getRuntime().availableProcessors())
So if the system has two processors the chunk count will be 3, and if it has four processors it will be 6.
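To make the arithmetic concrete, here is a minimal sketch that applies the formula above to a few processor counts. The class name `AutomaticChunkCount` and the helper `chunkCountFor` are made up for illustration and are not part of KNIME; only the expression inside the helper comes from the formula quoted above.

```java
public class AutomaticChunkCount {

    // Hypothetical helper (not a KNIME API): applies the quoted heuristic
    // to an explicit processor count so different machines can be compared.
    public static int chunkCountFor(int processors) {
        return (int) Math.ceil(1.5 * processors);
    }

    public static void main(String[] args) {
        // A few example processor counts, including an odd one to show the rounding up.
        for (int processors : new int[] {2, 3, 4, 6, 8}) {
            System.out.println(processors + " processors -> "
                    + chunkCountFor(processors) + " chunks");
        }

        // The same computation for the machine this program runs on.
        int available = Runtime.getRuntime().availableProcessors();
        System.out.println("this machine: " + available + " processors -> "
                + chunkCountFor(available) + " chunks");
    }
}
```

With 6 processors this gives 9 chunks, which would match the 3 extra chunks from the question if that machine reports 6 processors. Going by the formula alone, the count is derived from the processor count the JVM reports, not from the thread setting in the preferences.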
We haven't done a thorough analysis of this heuristic, though. I think the motivation is that you want slightly more chunks than CPUs: with more (smaller) jobs, the remaining chunks keep the CPUs busy even after the first jobs have completed. On the other hand, you do not want too many jobs, because that causes a lot of job swapping.
OK Wiswedel, thank you for your reply! Now it makes sense and it's clear.
Cheers