ERROR Compute LeadIT Docking Execute failed: Cannot read file "knime_container"

manueljaeger · November 3, 2015, 10:05am

Hello,

As a beginning KNIME user (Ubuntu 14.04, 64-bit, KNIME 2.12.0), I am struggling with the following error:

ERROR Compute LeadIT Docking 0:1006:974:1010:60 Execute failed: Cannot read file "knime_container_20151102_8785946112934572709.bin.gz"

This error seems to occur when I try to dock more than ~40,000 molecules with BioSolveIT's LeadIT Docking node. The containers are often still present in the /tmp folder, but not always. In the log you can see that the docking node seems to reset itself, after which it no longer finds the container (see below, or in more detail in the attachment). Afterwards it complains that there are too many open files. I am a little desperate here. I already checked this thread from the forum: https://tech.knime.org/forum/knime-users/knime-forgets-tables but that should not be the problem, since my Ubuntu only deletes temporary files on restart.

Any suggestions are highly welcome...

2015-11-03 03:33:41,207 : DEBUG : KNIME-Worker-89 : Buffer : Compute LeadIT Docking : 0:1006:974:977:60 : Opening input stream on file "/tmp/knime_Virtual Screeni9557/knime_container_20151102_6491373320711165774.bin.gz", 144 open streams
2015-11-03 03:33:41,228 : DEBUG : KNIME-Worker-108 : Buffer : Compute LeadIT Docking : 0:1006:974:1055:60 : Opening input stream on file "/tmp/knime_Virtual Screeni9557/knime_container_20151102_1533247244771051553.bin.gz", 345 open streams
2015-11-03 03:33:41,237 : DEBUG : KNIME-Worker-92 : Compute LeadIT Docking : Compute LeadIT Docking : 0:1006:974:1010:60 : reset
2015-11-03 03:33:41,237 : ERROR : KNIME-Worker-92 : Compute LeadIT Docking : Compute LeadIT Docking : 0:1006:974:1010:60 : Execute failed: Cannot read file "knime_container_20151102_8785946112934572709.bin.gz"
2015-11-03 03:33:41,237 : DEBUG : KNIME-Worker-79 : Buffer : Compute LeadIT Docking : 0:1006:971:60 : Opening input stream on file "/tmp/knime_Virtual Screeni9557/knime_container_20151102_123262422028646276.bin.gz", 346 open streams
2015-11-03 03:33:41,238 : DEBUG : KNIME-Worker-92 : Compute LeadIT Docking : Compute LeadIT Docking : 0:1006:974:1010:60 : Execute failed: Cannot read file "knime_container_20151102_8785946112934572709.bin.gz"
java.lang.RuntimeException: Cannot read file "knime_container_20151102_8785946112934572709.bin.gz"
   at org.knime.core.data.container.Buffer.iterator(Buffer.java:1674)
   at org.knime.core.data.container.ContainerTable.iterator(ContainerTable.java:126)
   at org.knime.core.node.BufferedDataTable.iterator(BufferedDataTable.java:317)
   at biosolveit.toolkit.Toolkit.findDataRowWithMoleculeName(Toolkit.java:798)
   at biosolveit.toolkit.Toolkit.readMoleculeToBufferedDataTable(Toolkit.java:949)
   at biosolveit.flexx.docking_v3.FlexXNodeModel.execute(FlexXNodeModel.java:226)
   at org.knime.core.node.NodeModel.executeModel(NodeModel.java:563)
   at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1136)
   at org.knime.core.node.Node.execute(Node.java:932)
   at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:554)
   at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:95)
   at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:179)
   at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:110)
   at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:328)
   at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:204)
   at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
   at java.util.concurrent.FutureTask.run(Unknown Source)
   at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
   at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: java.io.FileNotFoundException: /tmp/knime_Virtual Screeni9557/knime_container_20151102_8785946112934572709.bin.gz (Too many open files)
   at java.io.FileInputStream.open(Native Method)
   at java.io.FileInputStream.<init>(Unknown Source)
   at org.knime.core.data.container.BufferFromFileIteratorVersion20.<init>(BufferFromFileIteratorVersion20.java:116)
   at org.knime.core.data.container.Buffer.iterator(Buffer.java:1662)
   ... 18 more

cannotfindknimecontainer.txt

Alastair2 · May 2, 2016, 2:33pm

Hi Manuel,

I find it useful to break such large sets of molecules into chunks of ~1000 and use Chunk Loops or Parallel Loops to run the docking. I'd suggest building a new workflow to do that, and emptying out the tmp folder before restarting. To test that the loops are working OK, I'd also recommend trying them out on a smaller subset first (25-100 molecules): just insert a Row Sampling node before the loop start, run the loop, and once it is working, simply delete the Row Sampling node, reconnect and execute.
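
For readers who have not used the loop nodes, here is a rough sketch in Python of the chunking idea. It is not KNIME code and not how the Chunk Loop nodes are implemented. The function name `chunked` and the placeholder molecule names are made up for illustration.

```python
def chunked(rows, size=1000):
    """Yield consecutive slices of `rows`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


if __name__ == "__main__":
    # Hypothetical stand-in for a table of ~40,000 molecules.
    molecules = [f"mol_{i}" for i in range(40000)]

    # Try a small subset first (25-100 molecules), as suggested above.
    subset = molecules[:100]
    for chunk in chunked(subset, size=25):
        print(f"would dock {len(chunk)} molecules in this iteration")
```

Each iteration only touches one chunk, so a failure partway through costs one chunk rather than the whole run.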

Alastair

thor · May 3, 2016, 8:34am

As you already figured out yourself, the error message is quite clear about the cause. There is a limit on the number of files an application may have open, which is obviously too low here. You can either increase the limit and/or check why there are so many open files. If you go to /proc/<pid>/fd you will see a symlink for every open file (<pid> is the process ID of KNIME). Usually there are a few dozen knime_container*.bin.gz files open, but the number should be way below the standard OS limit.
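
If you prefer not to count the symlinks by hand, something like the following Python sketch should do it on Linux. It reads /proc/<pid>/fd and /proc/<pid>/limits. The function names are made up for this example, and you have to pass in the KNIME process ID yourself (the demo at the bottom just inspects the script's own process).

```python
import os


def count_open_files(pid, pattern="knime_container"):
    """Return (total open descriptors, descriptors whose target contains pattern)."""
    fd_dir = f"/proc/{pid}/fd"
    total = 0
    matching = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        total += 1
        if pattern in target:
            matching += 1
    return total, matching


def open_file_limit(pid):
    """Return the (soft, hard) 'Max open files' limits of a process as strings."""
    with open(f"/proc/{pid}/limits") as fh:
        for line in fh:
            if line.startswith("Max open files"):
                fields = line.split()
                return fields[3], fields[4]
    return None


if __name__ == "__main__":
    pid = os.getpid()  # replace with the process ID of KNIME
    total, matching = count_open_files(pid)
    print(f"open descriptors: {total}, knime_container files: {matching}")
    print(f"soft/hard limit on open files: {open_file_limit(pid)}")
```

Running it a few times while the docking nodes execute shows whether the number of open container files keeps growing towards the soft limit. You need to run it as the same user as KNIME (or as root) to be allowed to read the fd directory.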

Alastair2 · May 9, 2016, 2:37pm

I'd also recommend using an External Tool node to run the LeadIT executable in batch mode, as in my experience this uses a lot less memory for these large virtual screens than the dedicated Compute LeadIT Docking node.