R execution error inside Parallel Loops

Recently I have been encountering an error when using R View nodes inside parallel loops (KNIME: v3.4.0; R: v3.4.0). The specific error I get is:

ERROR R View (Table)       9:4278:4328:4152:4156 Execute failed: R evaluation failed.: "sink();sink(type='message')
close(knime.stdout.con);close(knime.stderr.con)
knime.output.ret<-c(paste(knime.stdout,collapse='\n'), paste(knime.stderr,collapse='\n'))
knime.output.ret"

This error causes the node to fail with a red 'X'. To fix this all you need to do is manually execute the node again and it will work. The loop will continue for 1-10 iterations and then fail again with an identical error.

I am trying to generate reproducible conditions for this error, but haven't been able to do so yet. Has anyone else come across this particular error recently?

 

Hi Joshua!

Yes, I have heard of this bug before, but haven't been able to reproduce it yet either. If you like, you can send me the relevant portions of your workflow at jonathan.hale@knime.com

Thank you!
Jonathan.

Hi Joshua,

I have faced this issue when executing the R command

rm(list=setdiff(ls(),'df'))

Do you have any solution? My KNIME version is 3.4.1, with R version 3.3.3.

Thanks

Hi, it’s not only in parallel loops but in all kinds of loops:
ERROR R Snippet 0:68 Execute failed: R evaluation failed.: "sink();sink(type='message')
close(knime.stdout.con);close(knime.stderr.con)
knime.output.ret<-c(paste(knime.stdout,collapse='\n'), paste(knime.stderr,collapse='\n'))
knime.output.ret"
Could a try/catch around the R node work, so that the same R code is executed again if the first attempt fails with this error?

Fabien

Hi, this issue still exists in KNIME AP 3.7.1.

I get exactly the same error. Node stops, red cross. Re-run: proceeds without problems…

Did anybody manage to identify the underlying problem?

I stopped receiving this error once I switched to a fresh install (of everything) after a laptop upgrade. Unfortunately a lot changed as part of this upgrade (hardware as well as versions of both R and KNIME) so I can’t attribute it to any one thing specifically.

From gut feeling (without knowledge of the code) I would assume some kind of race condition on the seed of the random generator that produces the temporary filenames.
I circumvented the problem by brute force: inside the parallel chunk nodes is a loop (“Generic Loop Start” – “Variable Condition Loop End”) which contains a “Try (Data Ports)” – “Catch Errors (Data Ports)” node pair wrapping the actual work. The loop iterates until the try–catch succeeds. The whole solution is quite ugly, especially because I needed two outputs from the internal nodes but the “Variable Condition Loop End” only allows one output, which required another workaround.
Due to the complex inner workings and the additional inputs into the loop, the parallel chunk nodes had problems identifying the nodes contained in the loop. This was solved by putting everything between the parallel chunk nodes into a wrapped metanode.
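
For readers who think in code rather than nodes, the retry loop described above is roughly the following. This is only a sketch in plain R, not the KNIME workflow itself: `retry_until_success`, `max_attempts` and `flaky` are made-up names for illustration. It is also an assumption that the idea transfers at all, since the error in this thread is raised by the KNIME node rather than by the user's R code, so an in-script `tryCatch` may never see it and the retry may have to stay at the node level.

```r
# Hypothetical helper: call `work` repeatedly until it succeeds
# or `max_attempts` is exhausted.
retry_until_success <- function(work, max_attempts = 5) {
  for (attempt in seq_len(max_attempts)) {
    result <- tryCatch(
      list(ok = TRUE, value = work()),
      error = function(e) list(ok = FALSE, error = conditionMessage(e))
    )
    if (result$ok) {
      # Plays the role of the "Variable Condition Loop End": stop on success
      return(list(value = result$value, attempts = attempt))
    }
    message("Attempt ", attempt, " failed: ", result$error)
  }
  stop("All ", max_attempts, " attempts failed")
}

# Stand-in for the real work: fails twice, then succeeds
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("simulated transient failure")
  "done"
}

out <- retry_until_success(flaky)
```

Capping the number of attempts is a deliberate choice in this sketch: an unbounded loop would hang forever if the failure turned out not to be transient.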

Thanks. Yesterday I implemented a similar solution (which I got from here). And indeed this solves the problem.

Dirty hack, but it works…

Just in case anybody on the KNIME team is following this thread: I’ve attached a workflow that can reproduce the issue. It launches four R snippets in a loop, and this loop is re-run until the error occurs. In my hands the error occurs reproducibly in KNIME 3.7.1 (macOS) and 3.7.0 (Linux); it is only a matter of waiting…

I’ve seen it occur after about 60 iterations, and sometimes I have to wait until ~700 iterations, but so far it has always crashed within 1000 iterations. I guess what Andreas writes above is true: probably some kind of randomly generated filenames colliding somehow…

It would be much appreciated if an expert could take a look. Even though I’ve worked around it in my current workflow for now, the fact that this thread is 1.5 years old suggests that people have been having this problem for some time…

190327 R issue.knwf (37.1 KB)

I can reproduce the error on my Mac (it came at iteration 1003).

In the error log there is an entry about a failed attempt to connect to Rserve (timeout 30000 ms). I am not sure where this originates or whether it is connected to the problem.

2019-03-27 22:23:30,834 : DEBUG : KNIME-Worker-7 : RController : R Snippet : 2:994 : Attempt #1 to connect to Rserve failed (waited 0ms, timeout 30000ms) org.rosuda.REngine.Rserve.RserveException: Cannot connect: Connection refused (Connection refused)

knime_r_log_20190328.log (16.0 KB)
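
To check whether such entries line up with the failing iterations, one could pull the Rserve connection failures out of the log. A minimal sketch in R, assuming the log lines follow the format of the entry quoted above; the second element of `log_lines` is a placeholder and not real log output, and in practice `log_lines` would come from `readLines()` on the actual log file.

```r
# First entry is the log line quoted above; the second is a placeholder
# standing in for unrelated log lines.
log_lines <- c(
  paste(
    "2019-03-27 22:23:30,834 : DEBUG : KNIME-Worker-7 : RController :",
    "R Snippet : 2:994 : Attempt #1 to connect to Rserve failed",
    "(waited 0ms, timeout 30000ms)",
    "org.rosuda.REngine.Rserve.RserveException:",
    "Cannot connect: Connection refused (Connection refused)"
  ),
  "placeholder line without an Rserve failure"
)

# Capture attempt number, time waited and timeout from matching lines
pattern <- paste0(
  "Attempt #([0-9]+) to connect to Rserve failed ",
  "\\(waited ([0-9]+)ms, timeout ([0-9]+)ms\\)"
)
hits <- regmatches(log_lines, regexec(pattern, log_lines))
hits <- Filter(length, hits)  # drop lines that did not match

failures <- do.call(rbind, lapply(hits, function(m) {
  data.frame(
    attempt    = as.integer(m[2]),
    waited_ms  = as.integer(m[3]),
    timeout_ms = as.integer(m[4])
  )
}))
print(failures)
```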

Thanks!

Looks like we confirmed it is reproducible. @jonathan.hale : Do you have enough information now to take a look again at this?

Cheers,
Sander

@sandernabuurs,

great work! I’ll look into it. Just a short question: if you execute the workflow and kill the Rserve process manually via your task manager, is the resulting workflow state the same?

@Mark_Ortmann,

Yes, I can confirm a manual kill of the Rserve process results in the same error.

@sandernabuurs,

thank you. I’ll have a look at it!

@sandernabuurs,

we have finally fixed the issue(s), and the fixes will be part of the upcoming bugfix release. Thanks for the workflow - it helped a lot!

Best
Mark
