MMP Molecule Fragment (RDKit) node IndexOutOfBoundsException

#1

Hello,
I cannot execute the “99_Community/04_Vernalis/03_Simple_MMP_Example” workflow available on the KNIME examples server.
It fails on the MMP Molecule Fragment (RDKit) node, giving the following error message:

INFO  MMP Molecule Fragment (RDKit) 2:2        Fragmentation SMIRKS: [#6+0;!$(*=,#[!#6]):1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*] (Upto 1 cuts)
INFO  MMP Molecule Fragment (RDKit) 2:2        Using 6 threads and 120 queue items to parallel process...
INFO  MMP Molecule Fragment (RDKit) 2:2        Starting fragmentation at Wed Jun 12 11:30:04 CEST 2019
ERROR MMP Molecule Fragment (RDKit) 2:2        Execute failed: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0

2019-06-12 11:40:47,247 : DEBUG : KNIME-Worker-34 : Node : MMP Molecule Fragment (RDKit) : 2:2 : Execute failed: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at com.vernalis.knime.mmp.nodes.fragutil.fragment.abstrct.AbstractMMPFragmentNodeModel$5.processFinished(AbstractMMPFragmentNodeModel.java:677)
    at org.knime.core.util.MultiThreadWorker.callProcessFinished(MultiThreadWorker.java:316)
    at org.knime.core.util.MultiThreadWorker.access$1(MultiThreadWorker.java:297)
    at org.knime.core.util.MultiThreadWorker$ComputationTask.done(MultiThreadWorker.java:462)
    at java.util.concurrent.FutureTask.finishCompletion(FutureTask.java:384)
    at java.util.concurrent.FutureTask.set(FutureTask.java:233)
    at java.util.concurrent.FutureTask.run(FutureTask.java:274)
    at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:328)
    at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:204)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
    at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:657)
    at java.util.ArrayList.get(ArrayList.java:433)
    at com.vernalis.knime.mmp.fragmentors.RWMolFragmentationFactory.applyDirectionalBond(RWMolFragmentationFactory.java:449)
    at com.vernalis.knime.mmp.fragmentors.RWMolFragmentationFactory.assignCreatedDblBondGeometry(RWMolFragmentationFactory.java:1315)
    at com.vernalis.knime.mmp.fragmentors.RWMolFragmentationFactory.assignCreatedDblBondGeometry(RWMolFragmentationFactory.java:1)
    at com.vernalis.knime.mmp.fragmentors.AbstractFragmentationFactory.rawFragmentMoleculeAlongBond(AbstractFragmentationFactory.java:729)
    at com.vernalis.knime.mmp.fragmentors.AbstractFragmentationFactory.breakMoleculeAlongMatchingBonds(AbstractFragmentationFactory.java:401)
    at com.vernalis.knime.mmp.nodes.fragutil.fragment.abstrct.AbstractMMPFragmentNodeModel.runFragmentationsOnRow(AbstractMMPFragmentNodeModel.java:967)
    at com.vernalis.knime.mmp.nodes.fragutil.fragment.abstrct.AbstractMMPFragmentNodeModel$5.compute(AbstractMMPFragmentNodeModel.java:643)
    at com.vernalis.knime.mmp.nodes.fragutil.fragment.abstrct.AbstractMMPFragmentNodeModel$5.compute(AbstractMMPFragmentNodeModel.java:1)
    at org.knime.core.util.MultiThreadWorker$ComputationTask$1.call(MultiThreadWorker.java:442)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    ... 6 more

KNIME 3.7.2 on Windows 7 x64
Vernalis plugin 1.20.1
RDKit plugin 3.6.1


#2

Thanks for reporting this - I will take a look, as clearly that shouldn’t be happening.

Thanks

Steve


#3

OK, an update. Firstly, thanks for the detail above - very helpful for locating where the problem is arising in the code. What I don't know yet is why this has come about. My guess is that something has changed in the innards of the RDKit code which this is built on.

The row which is causing the problem is this one:

image

My guess is that the problem arises when one of the N+-CH3 bonds is broken, as the error lies in the part of the code in which stereochemistry is assigned to double bonds where there wasn't any previously. This molecule should give 2 isomers:

image
and
image

I will try to figure out the cause behind this and get a fix out. I’m also going to tag @greglandrum here in case he has any obvious lightbulb moments about changes to the RDKit.

EDIT: I’ve now tracked down the problem:

When the above fragmentation happens, in order to differentiate E- and Z- fragments, the node looks at the C=N+ double bond, knows it did not have any stereochemistry assigned to it, and attempts to assign it. It does this as follows, for both ends in turn.

  1. Does the atom have a bond with assigned geometry? If yes, we’re done at this end of the double bond
  2. We don't have assigned geometry, but we need some here. We check each bond from the atom in turn and consider only single bonds, then choose the single bond to the neighbouring atom with the lowest atom index and set this bond to have geometry (arbitrarily, we use ENDUPRIGHT, but we could have used any option as long as we are consistent).
  3. Repeat for the other end of the double bond.
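
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `DirectionalBondSketch`, `Bond` and `assignDirection` are hypothetical names and a simplified bond model, not the Vernalis node's classes or the RDKit Java API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;

/** Illustrative only: a simplified model, not the Vernalis or RDKit API. */
final class DirectionalBondSketch {

    /** Hypothetical view of one bond, seen from an atom of the double bond. */
    static final class Bond {
        final int neighbourIdx;   // atom index at the far end of the bond
        final boolean single;     // true only for a plain single bond
        boolean hasDirection;     // true once a direction has been set

        Bond(int neighbourIdx, boolean single, boolean hasDirection) {
            this.neighbourIdx = neighbourIdx;
            this.single = single;
            this.hasDirection = hasDirection;
        }
    }

    /**
     * Handles one end of the double bond. Returns the neighbour index of the
     * bond carrying the direction, or empty if there is no single bond to use.
     */
    static OptionalInt assignDirection(List<Bond> bondsFromAtom) {
        // Step 1: a bond at this end already has geometry, nothing to do.
        for (Bond b : bondsFromAtom) {
            if (b.hasDirection) {
                return OptionalInt.of(b.neighbourIdx);
            }
        }
        // Step 2: among single bonds only, pick the lowest neighbour index.
        Bond best = null;
        for (Bond b : bondsFromAtom) {
            if (b.single && (best == null || b.neighbourIdx < best.neighbourIdx)) {
                best = b;
            }
        }
        if (best == null) {
            return OptionalInt.empty();
        }
        best.hasDirection = true; // stands in for setting e.g. ENDUPRIGHT
        return OptionalInt.of(best.neighbourIdx);
    }

    public static void main(String[] args) {
        // Atom 12 from the example: single bonds to 11 and 16, double bond to 13.
        List<Bond> atom12 = new ArrayList<>();
        atom12.add(new Bond(16, true, false));
        atom12.add(new Bond(13, false, false));
        atom12.add(new Bond(11, true, false));
        System.out.println(assignDirection(atom12).getAsInt()); // prints 11
    }
}
```

Step 3 would simply call `assignDirection` again with the bonds of the atom at the other end of the double bond.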

This process is shown below for when the N[13]-C[14] bond is broken:

What happens is that atom 12 (indicated) is checked first, and previously the two highlighted bonds were identified as candidate bonds. The bond to atom 11 ‘wins’ (11 < 16), and so this bond is assigned ‘up’. Then we repeat for atom 13 (the N+ atom), and the bond to atom 14 (an attachment point * atom in this case) wins (14 < 15), so again this bond is assigned ‘up’ as shown.
Unfortunately, RDKit is now identifying the ring comprising atoms 9-10-11-12-16-17 as aromatic, and so the two highlighted bonds from atom 12 to atoms 11 and 16 are no longer ‘single’. The result is that the code currently finds no candidate bonds for direction assignment, but still blindly tries to find the first such bond in the list - and… BOOM! This is the error you are seeing. I’ve no idea why RDKit has changed behaviour here, but I think it is fixable. Hopefully I can get something out early next week.
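
A minimal sketch of that failure mode, and of one possible guard, is shown here. The class and method names are hypothetical, the bond types are reduced to a boolean flag, and the guard is only an illustration: the actual change made in the plugin is not described in this thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative only: hypothetical names, not the Vernalis or RDKit API. */
final class EmptyCandidateSketch {

    /** Collects, in ascending order, the neighbours joined by a single bond. */
    static List<Integer> singleBondNeighbours(int[] neighbours, boolean[] isSingle) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < neighbours.length; i++) {
            if (isSingle[i]) {
                out.add(neighbours[i]);
            }
        }
        Collections.sort(out);
        return out;
    }

    /** Takes the first candidate blindly; throws if the list is empty. */
    static int lowestUnguarded(List<Integer> sortedCandidates) {
        return sortedCandidates.get(0);
    }

    /** Checks for an empty list first and returns -1 when nothing qualifies. */
    static int lowestGuarded(List<Integer> sortedCandidates) {
        return sortedCandidates.isEmpty() ? -1 : sortedCandidates.get(0);
    }

    public static void main(String[] args) {
        int[] neighboursOfAtom12 = {11, 16};
        boolean[] seenAsSingle = {true, true};     // the earlier behaviour
        boolean[] seenAsAromatic = {false, false}; // the ring is now aromatic

        System.out.println(lowestUnguarded(
                singleBondNeighbours(neighboursOfAtom12, seenAsSingle)));
        System.out.println(lowestGuarded(
                singleBondNeighbours(neighboursOfAtom12, seenAsAromatic)));
        try {
            lowestUnguarded(singleBondNeighbours(neighboursOfAtom12, seenAsAromatic));
        } catch (IndexOutOfBoundsException e) {
            System.out.println("No candidate bonds: " + e.getMessage());
        }
    }
}
```

The wording of the exception message depends on the Java version; the "Index: 0, Size: 0" text in the log above is the Java 8 form.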

(Incidentally, if you follow this through for the other N+-CH3 bond breaking, you can see why the arbitrary-but-consistent approach works:
image)
Steve


#4

This has now been fixed in the nightly and stable builds for v3.7 of KNIME - see "Update to v1.20.2 - MMP Bugfix" for details.

Steve


#5

Thank you, Steve, for the detailed description of the issue and the super fast fix!
