Dear KNIME and RDKit users,
I have a problem with the threshold parameter of the RDKit MCS node. As I understand it, this parameter sets the fraction of input molecules that must contain the Maximum Common Substructure (MCS), and it ranges from 0.0 to 1.0.
As you can see from the example workflow attached below, if I run this node on a compound set sharing a common MCS of 28 atoms, it works as expected with both threshold = 1.0 and threshold = 0.5. If I add a couple of different compounds to the same set and run the node with threshold = 1.0, I get an empty output. This is expected, since threshold = 1.0 requires the MCS to cover all the compounds. However, if I set the threshold to 0.5, I would expect to retrieve the same 28-atom MCS, because more than 90% of the molecules share it. Unfortunately, I don’t retrieve any MCS in that case either: the node returns an empty cell as output.
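To make my expectation concrete, here is a minimal sketch of the arithmetic in plain Python. It assumes the threshold simply means "the MCS must be present in at least this fraction of the input molecules", which is my reading of the parameter description and not necessarily how the node implements it. The function names and the molecule counts (22 in total, 20 sharing the core) are hypothetical and only for illustration; they are not taken from the workflow.

```python
import math


def min_molecules_required(threshold: float, n_molecules: int) -> int:
    """Smallest number of molecules the MCS must be found in,
    ASSUMING threshold is the fraction of the input set to cover."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return math.ceil(threshold * n_molecules)


def satisfies_threshold(n_matching: int, n_molecules: int, threshold: float) -> bool:
    """True if a substructure found in n_matching molecules meets the threshold."""
    return n_matching >= min_molecules_required(threshold, n_molecules)


# Hypothetical counts: 20 compounds share the core, 2 added compounds do not.
n_total = 22
n_sharing = 20

# threshold = 1.0 needs all 22 molecules, so no MCS is expected.
print(satisfies_threshold(n_sharing, n_total, 1.0))

# threshold = 0.5 needs only 11 molecules, so the shared core should qualify.
print(satisfies_threshold(n_sharing, n_total, 0.5))
```

Under this reading, the shared core easily clears a threshold of 0.5, which is why the empty output surprises me.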
Am I missing something or is this a bug?
Thanks in advance for any help!
rdkit_mcs_threshold_problem.knwf (78.5 KB)