Bug/crash: Templated Conformer Generation (RdKit)

Hi, the node “Templated Conformer Generation (RdKit)” crashes KNIME immediately with my list of molecules. If I partition the list into 15 molecules at a time, it works. With 20 molecules, it produces an error. With all 153 molecules, it crashes. I attached an example.

Templated Conformer Generation (RdKit)_bug.knwf (23.4 KB)

Sorry it’s taken a bit of time to get back to you with this. It’s not at all clear what is causing this behaviour at the moment.

On my machine I also see the same bombing out with the ‘top 20’ partition, which leads me to ask: how many CPU cores does your machine have?

Also, I tried running the bottom (full) table in a chunk loop, one row at a time, and I see some pretty weird stuff here too: random rows have empty output, and each time I re-run it, different rows are affected. I know there is another thread on this dataset at “Good workflow for optimizing structure geometry?”, but I am going to ask @greglandrum, who has commented over there, whether there is some sort of timeout built into the conformer generation. (Adding a random seed makes the same rows fail on every run rather than random ones, so I’m guessing it is simple conformer generation failure somewhere along the way.)

Steve (@Vernalis’ alter-ego!)


OK, so some further investigation shows the error is caused in the attempt to use the forcefield to minimise the geometry:

ERROR	 KNIME-Worker-16 Node	 Execute failed: Unknown exception
java.lang.RuntimeException: Unknown exception
	at com.vernalis.knime.chem.pmi.nodes.confs.rdkitgenerate.RdkitConfgenNodeModel$7.getCells(RdkitConfgenNodeModel.java:518)
	at org.knime.core.data.container.RearrangeColumnsTable.calcNewCellsForRow(RearrangeColumnsTable.java:541)
	at org.knime.core.data.container.RearrangeColumnsTable$ConcurrentNewColCalculator.compute(RearrangeColumnsTable.java:769)
	at org.knime.core.data.container.RearrangeColumnsTable$ConcurrentNewColCalculator.compute(RearrangeColumnsTable.java:1)
	at org.knime.core.util.MultiThreadWorker$ComputationTask$1.call(MultiThreadWorker.java:442)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:328)
	at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:204)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
	at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: org.RDKit.GenericRDKitException
	at org.RDKit.RDKFuncsJNI.ForceField_minimize__SWIG_6(Native Method)
	at org.RDKit.ForceField.minimize(ForceField.java:99)
	at com.vernalis.knime.chem.pmi.nodes.confs.rdkitgenerate.RdkitConfgenNodeModel.generateConformers(RdkitConfgenNodeModel.java:763)
	at com.vernalis.knime.chem.pmi.nodes.confs.rdkitgenerate.RdkitConfgenNodeModel.access$18(RdkitConfgenNodeModel.java:681)
	at com.vernalis.knime.chem.pmi.nodes.confs.rdkitgenerate.RdkitConfgenNodeModel$7.getCells(RdkitConfgenNodeModel.java:453)
	... 11 more

In the source code, line 763 is here:

			ForceField ff = gc.markForCleanup(getForceField(useTethers,
					forceField, uffCleanup, waveId, match, templateConf, tmp),
					waveId);
			ff.initialize();
			if (iterations > 0 && ff.minimize(iterations) != 0) { // <--LINE 763
				// didn't minimise - skip
				continue;
			}

I don’t understand why that throws a wobble (again, I’m sort of hoping @greglandrum can throw some light on that question), although it should be possible to modify the code slightly to not fall over completely as a result…
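As a rough illustration of the "don't fall over completely" idea, here is a minimal sketch that treats an exception from the minimiser the same way as a non-zero return code, so the conformer is skipped instead of failing the whole node. It is not the actual node code: `Minimizer` is a hypothetical stand-in for the one `ForceField` call used above, `tryMinimize` is an invented helper name, and it assumes the native-side failure surfaces as an unchecked exception (as the stack trace suggests).

```java
import java.util.ArrayList;
import java.util.List;

final class SafeMinimizeSketch {

    /** Hypothetical stand-in for the minimize(int) call on the force field. */
    @FunctionalInterface
    interface Minimizer {
        /** Returns 0 when minimisation succeeded, as assumed by the snippet above. */
        int minimize(int maxIterations);
    }

    /**
     * Returns true if the conformer should be kept: either no minimisation was
     * requested, or minimisation ran and returned 0. A non-zero return code or
     * an unchecked exception both mean "skip this conformer".
     */
    static boolean tryMinimize(Minimizer ff, int iterations) {
        if (iterations <= 0) {
            return true; // minimisation disabled, nothing to fail
        }
        try {
            return ff.minimize(iterations) == 0;
        } catch (RuntimeException e) {
            // Assumption: the RDKit wrapper exception is unchecked.
            // Treat it like a failed minimisation rather than rethrowing.
            return false;
        }
    }

    public static void main(String[] args) {
        List<Minimizer> candidates = new ArrayList<>();
        candidates.add(n -> 0); // minimises fine
        candidates.add(n -> 1); // did not minimise
        candidates.add(n -> {   // blows up on the native side
            throw new RuntimeException("Unknown exception");
        });

        int kept = 0;
        for (Minimizer ff : candidates) {
            if (!tryMinimize(ff, 200)) {
                continue; // same "skip" behaviour as the original loop
            }
            kept++;
        }
        System.out.println("kept " + kept + " of " + candidates.size());
    }
}
```

In the real node the `continue` would sit in the existing conformer loop, and it would probably be worth logging the exception so the affected rows can be found later. This only addresses the "Unknown exception" error, not the case where KNIME disappears entirely.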

The second problem is KNIME just disappearing without trace. That normally points to a segmentation fault (when Java tries to access a native C++ object which no longer exists), but normally, if I am running KNIME from an Eclipse SDK debugger, that throws a pile of junk into the Eclipse console saying that is what has happened. In this case, that isn’t happening, and the usual hs_err_pid[randomnumber].log isn’t getting dumped out either. I will keep thinking about this and see if I can track it down, but KNIME isn’t giving me much to go on at the moment.

Steve

The RDKit conformation generation does not have a timeout.

Thanks for confirming that, Greg. That’s what I thought. I think the empty rows are just down to no valid conformers being found. The rest is a bit more mystifying!

Steve

Thanks for looking at this, Steve.

For now, the “RDKit Add Conformers” node works well and will generate conformers for all molecules in the set I attached (as long as “use random coordinates as a starting point instead of distance geometry” is checked). I didn’t even need to assign a seed number. Greg attached the working workflow in the thread you linked above.
