1. How do I prevent the ConcurrentModificationException error when using the Palladian HttpRetriever Node?
2. If I can't prevent it, how do I retry the HttpRetriever Node by wrapping it in a Try-Catch block?
Some background: I have a very large workflow for crawling website data. It takes about 6 hours to run overnight, but soon I'll want it to run continuously. Any problem that causes the workflow to stop is fatal, so I am trying desperately to fix even the smallest issues.
Last week I finally upgraded from KNIME 2.12.1 to KNIME 3.1.1 and I've started to notice the workflow occasionally failing at the Palladian HttpRetriever node, throwing a ConcurrentModificationException. When I crawl a webpage I do it in parallel, with several HttpRetriever nodes calling a GET at the same time (just like a regular browser would load a webpage from many different sources).
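For anyone unfamiliar with the error: a ConcurrentModificationException is what Java's fail-fast collection iterators throw when a collection is structurally modified while it's being iterated. I don't know Palladian's internals, but if the parallel HttpRetriever nodes share something like a cookie store, one request adding a cookie while another iterates the store would produce exactly this. Here's a minimal single-threaded sketch of the mechanism (the `cookies` list and `triggersCme` helper are purely illustrative, not Palladian API):

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class CmeDemo {
    // Returns true if modifying a list mid-iteration throws CME.
    static boolean triggersCme() {
        List<String> cookies = new ArrayList<>(List.of("session", "tracking", "pref"));
        try {
            for (String cookie : cookies) {
                // Structural modification while the iterator is live; with
                // parallel GETs this would come from another thread sharing
                // the same store, which is far harder to reproduce reliably.
                cookies.remove("session");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // the fail-fast iterator detected the modification
        }
    }

    public static void main(String[] args) {
        System.out.println("threw CME: " + triggersCme());
    }
}
```

If this is what's happening inside the node, it's a thread-safety bug there (e.g. something that would need a `CopyOnWriteArrayList` or synchronization), and no amount of configuration on my side will prevent it, which is why question 2 (retrying) matters so much.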
In the past I've wrapped problematic nodes with a looping Try-Catch block to have the workflow retry rather than stop. This is how I handled an intermittent problem writing to an overworked database:
Retry Database Writer Until Success (*)
(*) I posted my workflow at the bottom of the thread "How to handle the 15 minutes disconnect from twitter with knime twitter workflow"
Basically what I do is wrap the problematic node in a Try-Catch and then wrap that in a Recursive-Loop that loops until the node successfully completes what it was supposed to do. Rather complicated, but more on that later.
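In plain code, the pattern I'm building out of Try/Catch and Recursive Loop nodes is just this (a generic retry helper; the names and the attempt limit are my own illustration, not anything from KNIME):

```java
import java.util.function.Supplier;

public class RetryDemo {
    // Minimal "on error, try again" helper: rerun the action until it
    // succeeds or maxAttempts is exhausted, then rethrow the last failure.
    static <T> T retry(int maxAttempts, Supplier<T> action) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure, then loop and retry
            }
        }
        throw last != null ? last : new IllegalArgumentException("maxAttempts < 1");
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails twice, then succeeds -- like an overworked database writer.
        String result = retry(5, () -> {
            if (++calls[0] < 3) throw new RuntimeException("transient failure");
            return "written";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Three lines of logic in code; eighteen nodes on the canvas.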
Unfortunately this technique doesn't work with the HttpRetriever Node because (I think) it is a two-port node, not a one-port node like the Database Writer. Consequently my downstream nodes now complain that "Loop start and end nodes are not in the same workflow".
This is my theory about what is happening. The Try node pushes a call-block onto the FlowVariable stack. The Catch node then pops that call-block off the top of the stack and allows the workflow to carry on. BUT the FlowVariables coming off the second data port of the HttpRetriever node (the Cookies data port) are now infected with the Try call-block and never pass through a Catch. This confuses the downstream Outer Loop-End, which complains that "Loop start and end nodes are not in the same workflow".
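To make my theory concrete, here is the mental model as a toy stack (this is purely a conceptual sketch of how I imagine the scope bookkeeping works, not KNIME's actual implementation):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ScopeDemo {
    // Conceptual model: Try pushes a scope marker, Catch pops it.
    // A port whose flow variables bypass the Catch carries the marker
    // downstream, where the Loop End expects to see only its own marker.
    static String topMarker(boolean portGoesThroughCatch) {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("LOOP_START"); // outer recursive loop opens its scope
        stack.push("TRY");        // Try node opens its scope
        if (portGoesThroughCatch) {
            stack.pop();          // Catch closes the Try scope
        }
        // What a downstream Loop End would see on top of the stack:
        return stack.peek();
    }

    public static void main(String[] args) {
        System.out.println("via Catch: " + topMarker(true));    // LOOP_START -> loop closes fine
        System.out.println("bypassing: " + topMarker(false));   // TRY -> loop end is confused
    }
}
```

If that model is right, the second data port leaves a dangling "TRY" on the stack, and the Outer Loop-End can't find its matching "LOOP_START", hence the error message.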
It's complicated so I've attached some pictures. You can see the Try-Catch block around the HttpRetriever in the Inner Loop. And you can see the "Loop start and end nodes are not in the same workflow" error in the Outer Loop.
To test my theory I tried adding another Catch to the second port of the HttpRetriever node. This is a bit weird but I've had some luck with this sort of thing in the past. No luck this time. But note here that the infection might still be leaking through the Recursive Loop End port.
But looking at the big picture, this is all way too complicated for what I'm trying to do. All I want is an "On Error Try Again" loop which should be two nodes. Right now I'm up to 18 nodes and a very complicated mess!