Dear Knimers,
I encountered a problem with the DB Reader recently. It does not occur very often and it is quite hard to track, but when it occurs, it breaks my production jobs.
This is the thing: One of our batch jobs needs to read in 4 Tables from the MySQL Database of ours. The first and the second of these Tables are quite large (The queries return about 20 Million rows x 15 columns). From time to time, the second Database Reader Node fails on doing that with an error message like
"ERROR KNIME-Worker-6 Database Reader Execute failed: Communications link failure The last packet successfully received from the server was xxx,xxx milliseconds ago"
The xxx numbers vary, but mostly are between 400 and 2000 seconds. What I already tried:
- Connecting the Database Readers by FlowVariable Ports, so that they only get executed when the one before is finished. This might have added some stability, but the Nodes still fail from time to time.
- Increasing the Database timeout in the knime.ini: "-Dknime.database.timeout=600". Funny thing is, that I often get millisecond readings in the error message that are lesse than 600 seconds ago.
- Upgrading to the latest version of Knime
The problem seems to occur more often if the database is used by other Knime processes at the same time or when it is used interactively (not in a batch job). When executing the queries directly in the Database, the queries always run without a problem. Also, table locking is not a problem as I checked this already. It is also not a typical MySQL problem of "wait_timeout" as this is set to 28,800 seconds. The tables in question are inno_db tables.
I have three questions regarding this:
1) Might this be connected to the behaviour of the connection pool (i.e., not automatic reconnect?) => Would modifying the databse url in the reader by appending "?autoReconnect=TRUE" help?
2) Are there any possibilities to override the connection pool so that for every request in Knime a new connection is established? I have no Problem with many connections, but with the connection pool, I think.
3) On a related topic: Why is parallel reading from database Nodes not possible by default? The most database engines are constructed to handle concurrent reading quite well. In my case, not even using two different connections (Master and Slave) run in parallel.
Thanks in advance and best regards,
Dennis