Multithreading

hinda · March 2, 2011, 1:41pm

Dear all

I would like to know whether KNIME makes use of a PC's multiple cores when running a workflow, and whether there are ways to reduce the processing time.

Thanks in advance

thor · March 2, 2011, 2:53pm

KNIME can execute nodes in parallel if they are not dependent on each other (independent branches in the flow). There are also some nodes that process all rows independently in parallel. So yes, KNIME does use multiple cores if possible.
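To illustrate the general idea of independent branches running side by side, here is a minimal plain-Java sketch. It is not KNIME's executor; the class name `IndependentBranches` and the method `runAll` are made up for this example, and each "branch" is just a task that needs nothing from the others.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; this is not how KNIME is implemented.
class IndependentBranches {

    // Runs every branch on a shared thread pool and returns the results
    // in the order the branches were given.
    static <T> List<T> runAll(List<Callable<T>> branches, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until all branches have finished.
            for (Future<T> f : pool.invokeAll(branches)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<Integer>> branches = new ArrayList<>();
        branches.add(() -> 1 + 2);   // branch A
        branches.add(() -> 3 * 4);   // branch B, independent of A
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(runAll(branches, cores));
    }
}
```

A branch that needs the output of another branch cannot be scheduled this way; it has to wait, which is why only independent branches benefit.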

michael · March 9, 2011, 3:28pm

I have a further question: I am using a Variables Loop Node with an input of 6 different variable sets. Within this loop I have a Cross Validation Node, and within that node I have an SVM learner. I would expect that each variable set and each cross-validation iteration can run in parallel, even if the SVM learner itself does not support multithreading. But KNIME uses just 3 of my 8 available cores.

If I replace the SVM learner with a Fuzzy c Node, only one core is busy. So it seems that the cross-validation and variables loops do not run concurrently? But http://www.ub.uni-konstanz.de/kops/volltexte/2008/6485/pdf/Parallel_and_Distributed_Data_Pipelining_with_KNIME.pdf says something else.

Is there something I missed?

thor · March 10, 2011, 10:06am

The paper you mention describes a prototype that we have not yet included in the official release. Therefore all loop iterations are still executed sequentially. Only certain nodes can process their input data in parallel (if there are no interdependencies between rows).
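The condition "no interdependencies between rows" can be shown with a small plain-Java sketch. This is not KNIME code; `RowParallel` and `rowSums` are names invented for the example. Each output value depends only on its own input row, so the rows can be handed to different threads.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; not a KNIME node.
class RowParallel {

    // Sums each row independently. Because no row needs data from any
    // other row, a parallel stream may process them on several cores.
    // The result list keeps the original row order.
    static List<Double> rowSums(List<double[]> rows) {
        return rows.parallelStream()
                   .map(row -> Arrays.stream(row).sum())
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<double[]> rows = Arrays.asList(
                new double[] {1.0, 2.0},
                new double[] {3.0, 4.0, 5.0});
        System.out.println(rowSums(rows));
    }
}
```

A computation such as a running total, where row n needs the result of row n-1, does not fit this pattern and would have to be processed in order.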

michael · March 10, 2011, 10:51am

Thanks for your reply. Do you think it's possible for me to get and use this prototype? Otherwise I will have to create my own parallel cross-validation node.
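For reference, the core of such a parallel cross-validation could look roughly like the following plain-Java sketch. It does not use the KNIME node API; `ParallelFolds`, `crossValidate` and the `scorer` callback are hypothetical names, and the scorer stands in for "train a model on the training indices, evaluate it on the test indices, return a score". Rows are assigned to folds round-robin, which is only one of several possible splitting schemes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiFunction;

// Hypothetical illustration only; not based on the KNIME API.
class ParallelFolds {

    // Splits row indices 0..n-1 into k folds and scores every fold on
    // its own pool thread. Returns one score per fold, in fold order.
    static List<Double> crossValidate(int n, int k, int threads,
            BiFunction<List<Integer>, List<Integer>, Double> scorer) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> pending = new ArrayList<>();
            for (int fold = 0; fold < k; fold++) {
                List<Integer> train = new ArrayList<>();
                List<Integer> test = new ArrayList<>();
                for (int i = 0; i < n; i++) {
                    // Row i is a test row in exactly one fold.
                    (i % k == fold ? test : train).add(i);
                }
                // Folds share no state, so they can run concurrently.
                pending.add(pool.submit(() -> scorer.apply(train, test)));
            }
            List<Double> scores = new ArrayList<>();
            for (Future<Double> f : pending) {
                scores.add(f.get());
            }
            return scores;
        } finally {
            pool.shutdown();
        }
    }
}
```

The learner inside `scorer` can stay single-threaded; the speed-up comes from running several folds at once, so it is bounded by the number of folds and the number of cores.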

c_koch · April 11, 2011, 12:00pm

Hi, could you please also provide me with the parallel cross-validation nodes?

thanks in advance,

cheers

chris

ccoble · April 21, 2011, 1:30pm

Pervasive has created a plugin for KNIME with their DataRush engine (http://www.pervasivedatarush.com/Products/DataRushforKNIME.aspx) that uses dataflow networks to create pipelined parallelism. In my experience it does a great job of taking full advantage of the available cores, and it comes with profiling tools so you can squeeze every bit out of your multicore machine. If the nodes you are looking for don't come out of the box with it, there is a Java API which allows you to create your own KNIME nodes using their pipelining engine. You can request a trial of the product at their site.

PRoberts · November 19, 2012, 4:57pm

Quick update: Pervasive has worked with KNIME to improve the KNIME API so that new nodes can be enabled for parallel processing more easily, and has added to their KNIME analytics acceleration stack. Here’s the new link for more info: http://bigdata.pervasive.com/Partners/KNIME.aspx

dhanu_knime · December 6, 2020, 2:04pm

Can multithreading be used within the same workflow to process a 4 GB file in different ways at the same time? Please provide a node reference.