execute multiple R processes

I am using KNIME and R to analyze a large dataset. I use KNIME filter nodes to select a subset of my data, which I then pass to R. When I wanted to filter my data in a different manner and perform the same test in R, rather than create a separate KNIME workflow I duplicated my connected nodes into the same workflow, so that I have two parallel workflows, each with its own end node. If I execute both of these workflows at the same time, they both begin processing. If I use top to view the CPU and memory utilization of the R processes on my Linux machine, I see a single R process. I had expected that when I executed both workflows at the same time I would see two R processes, one for each group of connected nodes.

Is there a way to configure my workflow so that each group of nodes executes in a separate R process and thereby uses the resources of my computer more efficiently?

 

Thanks, Doug

The R scripts themselves run concurrently, but the data transfer to and from R is tunneled through an R/Java library called JRI. That step (any movement of data between the processes, plus anything you do in the configuration dialog) happens in this single-threaded library. The scripts themselves are then executed in separate R process(es).
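To make that distinction concrete, here is a toy model in Python. It is not KNIME or JRI code: the names `simulate`, `run_branch`, and `bridge_lock`, and all of the timings, are made up for illustration. It only assumes what is described above: transfers go through one shared, single-threaded bridge, while the script runs themselves can overlap.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def simulate(branches=("branch_a", "branch_b"), transfer_s=0.05, script_s=0.3):
    """Run each branch as: transfer in -> run script -> transfer out.

    Transfers share one lock (a stand-in for the single-threaded bridge);
    script runs take no lock, so they are free to overlap.
    All durations are arbitrary, illustrative values.
    """
    bridge_lock = threading.Lock()   # hypothetical model of the shared bridge
    stats_lock = threading.Lock()    # protects the counters below
    stats = {"transfers": 0, "max_transfers": 0, "scripts": 0, "max_scripts": 0}

    def tracked(kind, seconds):
        # Record how many steps of this kind are active at the same time.
        with stats_lock:
            stats[kind] += 1
            stats["max_" + kind] = max(stats["max_" + kind], stats[kind])
        time.sleep(seconds)          # pretend to do the work
        with stats_lock:
            stats[kind] -= 1

    def run_branch(name):
        with bridge_lock:            # serialized: data into R
            tracked("transfers", transfer_s)
        tracked("scripts", script_s) # concurrent: the script itself
        with bridge_lock:            # serialized: results back out
            tracked("transfers", transfer_s)
        return name

    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        finished = list(pool.map(run_branch, branches))
    return finished, stats["max_transfers"], stats["max_scripts"]


if __name__ == "__main__":
    finished, max_transfers, max_scripts = simulate()
    print("finished branches:", finished)
    print("most transfers active at once:", max_transfers)
    print("most scripts active at once:", max_scripts)
```

In this model, a workflow whose time is dominated by moving data gains little from running branches in parallel, while one dominated by script execution gains more. How that balance looks for a real workflow depends on the data size and the script, which this sketch does not measure.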

Btw, in previous versions we used files to exchange data between R and Java ... but that was super slow compared to the current implementation.

So, to answer your question: I don't see a way to optimize this process, short of inventing something completely new.
