Hi,
I'm using the KNIME Database Reader node to access and retrieve information from a PostgreSQL 9.1 database, and I have run into the following problem. I have to retrieve a table of ~200,000 rows and 780 columns, so I prepared a query that retrieves the whole table content in a single go.
1. The query works fine in pgAdmin and other SQL clients.
2. The query works fine in the Database Reader node when I execute the workflow from the KNIME workbench (it takes ~5 minutes).
3. But when I execute the same workflow in batch mode, the workflow execution never ends (I killed it after two and a half days). The workflow log shows no information other than that the query is being executed (Database Reader 0:45 is the node that contains the query retrieving this very large table):
...
Database Reader 0:45 has new state: CONFIGURED_QUEUED
2014-01-26 20:19:18,854 DEBUG KNIME-Worker-3 Java Snippet : Configure succeeded. (Java Snippet)
...
2014-01-26 20:19:18,857 DEBUG KNIME-Worker-3 Database Reader : Configure succeeded. (Database Reader)
2014-01-26 20:19:18,857 DEBUG KNIME-Worker-0 WorkflowManager : Database Reader 0:45 doBeforePreExecution
2014-01-26 20:19:18,857 DEBUG KNIME-Worker-0 NodeContainer : Database Reader 0:45 has new state: PREEXECUTE
2014-01-26 20:19:18,857 DEBUG KNIME-Worker-2 WorkflowManager : Table Creator 0:925 doBeforePreExecution
2014-01-26 20:19:18,857 DEBUG KNIME-Worker-2 NodeContainer : Table Creator 0:925 has new state: PREEXECUTE
2014-01-26 20:19:18,857 DEBUG KNIME-Worker-2 WorkflowManager : Table Creator 0:925 doBeforeExecution
2014-01-26 20:19:18,857 DEBUG KNIME-Worker-2 NodeContainer : Table Creator 0:925 has new state: EXECUTING
2014-01-26 20:19:18,857 DEBUG KNIME-Worker-2 LocalNodeExecutionJob : Table Creator 0:925 Start execute
2014-01-26 20:19:18,858 DEBUG KNIME-Worker-2 WorkflowFileStoreHandlerRepository : Adding handler 750476b0-56cf-4f7b-ae6b-559a35884975 (Table Creator 0:925: <no directory>) - 10 in total
2014-01-26 20:19:18,858 DEBUG KNIME-Worker-0 WorkflowManager : Database Reader 0:45 doBeforeExecution
2014-01-26 20:19:18,858 DEBUG KNIME-Worker-0 NodeContainer : Database Reader 0:45 has new state: EXECUTING
2014-01-26 20:19:18,858 DEBUG KNIME-Worker-0 LocalNodeExecutionJob : Database Reader 0:45 Start execute
2014-01-26 20:19:18,858 DEBUG KNIME-Worker-0 WorkflowFileStoreHandlerRepository : Adding handler eb439f58-684b-4078-a398-9046590d3240 (Database Reader 0:45: <no directory>) - 11 in total
2014-01-26 20:19:18,858 DEBUG pool-1-thread-1 DatabaseConnectionSettings : Opening database connection to "jdbc:postgresql://localhost:5432/DDBB"...
2014-01-26 20:19:18,858 DEBUG KNIME-TableIO-1 MemoryObjectTracker : Adding org.knime.core.data.container.Buffer$BufferMemoryReleasable (6 in total)
2014-01-26 20:19:18,859 INFO KNIME-Worker-2 LocalNodeExecutionJob : Table Creator 0:925 End execute (0 secs)
2014-01-26 20:19:18,859 DEBUG KNIME-Worker-2 WorkflowManager : Table Creator 0:925 doBeforePostExecution
2014-01-26 20:19:18,859 DEBUG KNIME-Worker-2 NodeContainer : Table Creator 0:925 has new state: POSTEXECUTE
2014-01-26 20:19:18,859 DEBUG KNIME-Worker-2 WorkflowManager : Table Creator 0:925 doAfterExecute - success
2014-01-26 20:19:18,859 DEBUG KNIME-Worker-2 NodeContainer : Table Creator 0:925 has new state: EXECUTED
2014-01-26 20:19:18,877 DEBUG KNIME-Worker-0 DatabaseReaderConnection : Executing SQL statement as executeQuery: SELECT
...
In this case the query does not return any information even after more than two days of execution.
In cases 2 and 3 I use the same parameters:
-Dknime.database.fetchsize=20000
-Dknime.database.timeout=100
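For reference, this is roughly how I pass those parameters on the batch-mode command line (the workflow path is a placeholder; the exact executable name depends on the platform):

```shell
# Run the workflow headless with the KNIME batch application.
# Everything after -vmargs is handed to the JVM as system properties.
knime -consoleLog -nosplash \
  -application org.knime.product.KNIME_BATCH_APPLICATION \
  -workflowDir="/path/to/my/workflow" \
  -vmargs \
  -Dknime.database.fetchsize=20000 \
  -Dknime.database.timeout=100
```

In the workbench case the same two properties are set in knime.ini (also after the -vmargs line).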
I cannot see why the behavior differs between executing the workflow in the KNIME workbench and in batch mode.
Has anybody experienced a similar behaviour with very large tables? Do you have any hints or advice on how to avoid these situations?
Thanks in advance!
Oscar