parallel chunks - use full server power

linh · July 14, 2022, 4:10pm

Adjusting the number of threads in the KNIME settings is a good starting point for this performance issue. You may also find useful information in this forum post: Does Parallel Chunk Start node uses the threads available?

Are you referring to a server with a KNIME license here?

You can also try allocating more of your machine's memory to KNIME by raising the -Xmx value in your knime.ini file, or try the new table backend.

Since you are considering splitting the data set in half and running two independent KNIME instances, you could also do the same within a single KNIME instance. Just make sure the results of both branches don’t need to be merged in the Parallel Chunk End node (this is expensive, since all data is copied), for instance by having each branch use its own file readers/writers.
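
To illustrate the idea outside of KNIME, here is a rough Python sketch of the same pattern: the data is split into chunks, each branch processes its chunk and writes its own output file, and nothing is merged at the end. The function names, the part_N.csv file names and the "double the value" step are made up for the example, and it uses threads only to keep the sketch simple; it is not how KNIME implements parallel chunks.

```python
import csv
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def process_chunk(rows, out_path):
    """Process one chunk and write the result straight to its own file."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        for key, value in rows:
            # Placeholder transformation (hypothetical): double the value.
            writer.writerow([key, value * 2])
    return out_path


def run_in_branches(rows, out_dir, n_branches=2):
    """Split rows into contiguous chunks and process them concurrently.

    Each branch writes its own file, so no merged table is ever built.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Ceiling division, with a minimum of 1 so an empty input is handled.
    size = max(1, -(-len(rows) // n_branches))
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=n_branches) as pool:
        futures = [
            pool.submit(process_chunk, chunk, out_dir / f"part_{i}.csv")
            for i, chunk in enumerate(chunks)
        ]
        return [f.result() for f in futures]
```

Downstream steps can then read the part files independently, which is the equivalent of ending each branch with a writer node instead of routing everything through one merge point.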

If you want, you can also take a thread dump of the KNIME process with “jstack” and share it with us, so that our developers can have a look.
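
In case it helps, here is a small Python sketch for capturing such a dump to a file. It assumes the jstack tool from a JDK is on your PATH and that you have already looked up the process ID of the running KNIME process; the helper names and the output file name are only examples.

```python
import shutil
import subprocess


def jstack_command(pid, with_locks=True):
    """Build the jstack command line for the given process ID."""
    cmd = ["jstack"]
    if with_locks:
        cmd.append("-l")  # long listing: also print lock information
    cmd.append(str(pid))
    return cmd


def write_thread_dump(pid, out_path):
    """Run jstack against pid and save the thread dump to out_path."""
    if shutil.which("jstack") is None:
        # Assumption: jstack comes from a JDK install and must be on PATH.
        raise RuntimeError("jstack not found on PATH; it ships with the JDK")
    result = subprocess.run(
        jstack_command(pid), capture_output=True, text=True, check=True
    )
    with open(out_path, "w") as f:
        f.write(result.stdout)
    return out_path
```

For example, write_thread_dump(12345, "knime_threads.txt") would write the dump for a process with the (made-up) ID 12345. Taking two or three dumps a few seconds apart while the workflow is slow usually tells more than a single one.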

Best regards,
Linh