I am connecting to a database (ORACLE) and executing the same query on multiple (~15) tables.
For this I am using a loop (Variable Loop Start) using the table name as variable. I established the database connection outside the loop. Nodes DB table selector, DB Query, DB Reader are within the loop.
Locally the workflow executes fine, but on KNIME Server I often receive an error because the database connection breaks during the individual loop iterations.
The question: Is it better to establish the connection within each loop iteration, or to check within the loop whether the connection is still valid (try/catch)? What would be the recommended way to avoid errors due to a lost DB connection?
Thanks for any hints
I have hit this same problem many times when I have a large number of iterations and the queries within each iteration are quite long-running. I find that my database connection gets “chopped” by the server because I have exceeded the connection time limit, or seemingly just because it “feels like it”. Having to restart a loop under those circumstances, or adding extra complexity to the workflow just to enable a partial restart, is painful.
For trivial (relatively quick) loops, I just do the kind of thing you appear to have been doing: grab a connection and perform the loop.
For non-trivial (longer running) loops though, I tend to follow this pattern:
LOOP START --- CONNECT DB --- DO SOME STUFF --- DISCONNECT DB --- LOOP END
You are of course free to mix-and-match where you do the db and non-db stuff, and the above is just a mock-up to give the idea rather than being a demonstration of serious database processing!
So on each iteration of the loop, the connection to Oracle is grabbed and utilised, with it being closed when no longer needed.
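This connect-per-iteration pattern can be sketched in plain Python. To be clear, this is only an illustration of the idea, not actual KNIME nodes: it uses the standard-library `sqlite3` module as a stand-in for an Oracle connection, and the table names and helper functions are made up for the example.

```python
import sqlite3

# Hypothetical list of tables; stands in for the KNIME loop variable.
TABLES = ["orders", "customers", "invoices"]

def setup_demo_db(path):
    """Create a throwaway database so the example is self-contained."""
    con = sqlite3.connect(path)
    with con:
        for name in TABLES:
            con.execute(f"CREATE TABLE {name} (id INTEGER)")
            con.execute(f"INSERT INTO {name} VALUES (1), (2)")
    con.close()

def process_all(path):
    """LOOP START -> CONNECT -> DO STUFF -> DISCONNECT -> LOOP END."""
    counts = {}
    for table in TABLES:             # loop start
        con = sqlite3.connect(path)  # fresh connection each iteration
        try:
            cur = con.execute(f"SELECT COUNT(*) FROM {table}")
            counts[table] = cur.fetchone()[0]
        finally:
            con.close()              # always disconnect before the next iteration
    return counts
```

With a real Oracle database the connect step costs more than with a local file, but as noted above the connection is usually supplied quickly enough that paying it once per iteration is acceptable for long-running loops.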
In some workflows, even within a single iteration of the loop, I may open the connection, perform an action and close the connection several times. These days Oracle is pretty quick at supplying a connection, and provided I remember to include the connection closer, it doesn’t appear to leave idle sessions hanging around, and I have not experienced any problems doing it this way.
DB connection handling can be frustrating. There are any number of reasons why a connection may get terminated without the workflow being “aware” of it, and unfortunately I have not found a good way of testing whether a connection is still available. If a DB connection has been closed, I often find that a DB Query Reader, for example, will appear to execute indefinitely without any error being returned, which is one of the reasons why I have in many cases adopted my “connect -> query -> close” pattern.
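The “check whether the connection is still valid” idea from the original question can be approximated with a cheap probe query wrapped in try/except, reconnecting on failure. Again a hedged sketch, with stdlib `sqlite3` standing in for Oracle (with Oracle the probe would typically be `SELECT 1 FROM dual`) and an invented helper name. Note it only catches connections that fail promptly; as described above, a dead network connection may instead hang, which a probe alone cannot fix without a query timeout.

```python
import sqlite3

def get_valid_connection(path, con=None):
    """Return a usable connection, reconnecting if the old one is stale."""
    if con is not None:
        try:
            con.execute("SELECT 1")  # cheap probe; on Oracle: SELECT 1 FROM dual
            return con               # probe succeeded, keep the existing connection
        except sqlite3.Error:
            pass                     # connection closed/broken; fall through
    return sqlite3.connect(path)     # establish a fresh connection
```

A usage sketch: a closed connection fails the probe and is silently replaced.

```python
con = sqlite3.connect(":memory:")
con.close()                                  # simulate a dropped connection
con = get_valid_connection(":memory:", con)  # probe fails, reconnects
con.execute("SELECT 1")                      # works again
```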
Thanks for your reply. This sounds like a good approach in this case, and I will give it a try.
I experimented further with the advanced settings of Oracle Connector:
“Automatically reconnect to database” and “Restore database connection”.
With these enabled, my later tests ran fine as well. I will have to monitor further.
I would be interested to hear your experience of whether those auto-reconnect settings make a difference. Of course it is difficult to test fully, because you don’t really get to know when it auto-reconnected; the only time you know is when it doesn’t work.
Personally I didn’t feel I had much success, but that may be because the times it didn’t work were when I had a few very long-running queries: the connection would be terminated while the query was still running, and then it simply never returned (I had to stop the workflow manually)!
So that’s when I opted for always having a fresh connection at the start of such a query!