MCS node breaking up rings

Hi,

I've noticed some (I think) erroneous behaviour in the MCS node. It seems that under certain circumstances it will break open rings even when told not to.

I've attached a zipped-up KNIME workflow to demonstrate the problem. It shows how mining a fuzzy MCS (fuzzy in terms of element type) from a set of structures results in a SMARTS pattern containing aromatic bonds that no longer describe a ring. This is the SMARTS I'm getting:

[#6]-[#6,#7]-[#7,#6]1-[#6]-[#6]-[#7](-[#6]-[#6]-1)-[#6]1:[#6,#7]:[#6](:[#6]:[#6]:[#6]:[#6]):[#7,#6]:[#6]:[#7]:1

The input SMILES strings are:

FC(F)(F)C1=CC=CC=C1C1=CC=C2C(=C1)N=CN=C2N1CCN(CC1)C(=O)C=C

OCC#CC(=O)N1CCN(CC1)C1=C2C=C(Cl)C(=CC2=NC=N1)C1=CC=C(F)C=C1F

ClC1=CC=CC=C1C1=C2N=C(N=CC2=CC=C1)N1CCN(CC1)C(=O)C=C

ClC1=CC=C2C(=C1)N=CN=C2N1CCN(C(C1)C#N)C(=O)C=C

C=CC(=O)N1CCN(CC1C#N)C1=C2C=CC=CC2=NC=N1

ClC1=CC=C2C(=C1)N=CN=C2N1CCN(C(C1)C#N)C(=O)C=C

C=CC(=O)N1CCN(CC1)C1=C2C=CC(=CC2=NC=N1)C1=CC=CC=C1

ClC1=C(C=C2N=CN=C(N3CCC(CC3)NC(=O)C=C)C2=C1)C1=CC=CC=C1

The node configuration is as follows:

Threshold: 1.0

Ring matches ring only: checked

Complete rings only: checked

Match valences: unchecked

Atom comparisons: "Compare Any"

Bond comparisons: "Compare Order"

Timeout: 300

The node doesn't time out in this case, but I've seen this behaviour more often when it does.
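For anyone who wants to try this outside KNIME, here is a minimal sketch of what I assume is the equivalent call in the RDKit Python API. The mapping of the node options onto the rdFMCS.FindMCS keyword arguments is my assumption, the helper name find_fuzzy_mcs is made up for illustration, and whether the broken ring shows up will depend on the RDKit version.

```python
# Input SMILES copied from the post above (the repeated entry is kept as posted).
INPUT_SMILES = [
    "FC(F)(F)C1=CC=CC=C1C1=CC=C2C(=C1)N=CN=C2N1CCN(CC1)C(=O)C=C",
    "OCC#CC(=O)N1CCN(CC1)C1=C2C=C(Cl)C(=CC2=NC=N1)C1=CC=C(F)C=C1F",
    "ClC1=CC=CC=C1C1=C2N=C(N=CC2=CC=C1)N1CCN(CC1)C(=O)C=C",
    "ClC1=CC=C2C(=C1)N=CN=C2N1CCN(C(C1)C#N)C(=O)C=C",
    "C=CC(=O)N1CCN(CC1C#N)C1=C2C=CC=CC2=NC=N1",
    "ClC1=CC=C2C(=C1)N=CN=C2N1CCN(C(C1)C#N)C(=O)C=C",
    "C=CC(=O)N1CCN(CC1)C1=C2C=CC(=CC2=NC=N1)C1=CC=CC=C1",
    "ClC1=C(C=C2N=CN=C(N3CCC(CC3)NC(=O)C=C)C2=C1)C1=CC=CC=C1",
]


def find_fuzzy_mcs(smiles_list, timeout=300):
    """Run FindMCS with settings assumed to mirror the node configuration."""
    # Imported here so the inputs above can be inspected without RDKit installed.
    from rdkit import Chem
    from rdkit.Chem import rdFMCS

    mols = [Chem.MolFromSmiles(smi) for smi in smiles_list]
    if any(mol is None for mol in mols):
        raise ValueError("at least one SMILES failed to parse")

    return rdFMCS.FindMCS(
        mols,
        threshold=1.0,                                # Threshold: 1.0
        ringMatchesRingOnly=True,                     # Ring matches ring only: checked
        completeRingsOnly=True,                       # Complete rings only: checked
        matchValences=False,                          # Match valences: unchecked
        atomCompare=rdFMCS.AtomCompare.CompareAny,    # Atom comparisons: "Compare Any"
        bondCompare=rdFMCS.BondCompare.CompareOrder,  # Bond comparisons: "Compare Order"
        timeout=timeout,                              # Timeout in seconds
    )


# Usage (requires RDKit):
# result = find_fuzzy_mcs(INPUT_SMILES)
# print(result.canceled, result.smartsString)
```

The canceled flag on the result tells you whether the search hit the timeout, which is worth checking given the remark above.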

I think this behaviour is a bug, and it is definitely undesired for me, since I'm using this node to identify scaffolds with complete rings. Rather than breaking ring bonds, I would expect the node to produce a SMARTS pattern that simply leaves out the atoms of the incomplete ring.
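As a rough way to flag the problem in an output pattern, the sketch below looks for aromatic bonds that are not part of any cycle. It uses only the standard library and a deliberately tiny parser that I wrote for illustration: it assumes bracket atoms, explicit bond symbols, branches and single-digit ring closures, which covers the SMARTS above but is not a general SMARTS parser. All function names are hypothetical.

```python
from collections import deque

# SMARTS reported above, copied verbatim.
SMARTS = (
    "[#6]-[#6,#7]-[#7,#6]1-[#6]-[#6]-[#7](-[#6]-[#6]-1)-[#6]1:[#6,#7]"
    ":[#6](:[#6]:[#6]:[#6]:[#6]):[#7,#6]:[#6]:[#7]:1"
)


def parse_simple_smarts(smarts):
    """Return (atoms, bonds) for a restricted SMARTS subset.

    bonds is a list of (atom_i, atom_j, bond_symbol) tuples.
    """
    atoms, bonds = [], []
    branch_stack, open_rings = [], {}
    prev, pending = None, None
    i = 0
    while i < len(smarts):
        ch = smarts[i]
        if ch == "[":
            end = smarts.index("]", i)  # no nested brackets supported
            atoms.append(smarts[i:end + 1])
            idx = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, idx, pending or "implicit"))
            prev, pending = idx, None
            i = end
        elif ch in "-=#:":
            pending = ch
        elif ch == "(":
            branch_stack.append(prev)
        elif ch == ")":
            prev = branch_stack.pop()
        elif ch.isdigit():
            if ch in open_rings:
                start, start_bond = open_rings.pop(ch)
                bonds.append((start, prev, pending or start_bond or "implicit"))
            else:
                open_rings[ch] = (prev, pending)
            pending = None
        else:
            raise ValueError(f"unsupported SMARTS token {ch!r}")
        i += 1
    return atoms, bonds


def bond_in_ring(n_atoms, bonds, skip):
    """True if the ends of bond `skip` stay connected once that bond is removed."""
    adjacency = {k: [] for k in range(n_atoms)}
    for b, (u, v, _) in enumerate(bonds):
        if b != skip:
            adjacency[u].append(v)
            adjacency[v].append(u)
    start, goal, _ = bonds[skip]
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def open_aromatic_bonds(smarts):
    """List (i, j) atom index pairs joined by an aromatic bond outside any ring."""
    atoms, bonds = parse_simple_smarts(smarts)
    return [
        (u, v)
        for b, (u, v, symbol) in enumerate(bonds)
        if symbol == ":" and not bond_in_ring(len(atoms), bonds, b)
    ]


print(open_aromatic_bonds(SMARTS))
```

For the SMARTS above this reports four aromatic bonds, between zero-based atoms 10 to 14, which is the four-atom chain left over from the opened ring. A pattern with only complete rings gives an empty list.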

Cheers,

Richard

Hi Richard,

This is a bug in the underlying C++ code. It's a known one (https://github.com/rdkit/rdkit/issues/945) that's been around for a while but that somehow hasn't been fixed yet. I will take a look and see if I can fix it.

Best,

-greg

p.s. Thanks for providing the workflow to reproduce the problem!