RDKit Open 3D Alignment node

Hi Manuel,

Recently I have been using the RDKit Open3D alignment method via the Python bindings (in an IPython notebook), and it occurred to me that I could/should test the same within KNIME using the RDKit nodes.

One option that Paolo just explained to me (when I bumped into him at the UKQSAR meeting yesterday) is the 'local optimisation' setting. At the moment this is not exposed in the KNIME interface, but I thought an additional checkbox for it would be useful for others as well(?). [Implementation detail: if checked, this would just add 4 to the selected accuracy level to give the final options int.]
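For anyone wanting to try this from the Python bindings in the meantime, here is a minimal sketch of how that options integer could be put together. The helper name `o3a_options`, the constant names and the wrapper `align_local` are made up for illustration; `rdMolAlign.GetO3A` and its `options` argument are the existing RDKit calls, and the sketch assumes both molecules already carry 3D conformers and can be MMFF-typed.

```python
# Illustrative names (not part of RDKit's Python API):
O3_ACCURACY_MAX = 3   # accuracy level lives in the two lowest bits, 0..3
O3_LOCAL_ONLY = 4     # the next bit (value 4) requests local optimisation only


def o3a_options(accuracy: int = 0, local_only: bool = False) -> int:
    """Combine an accuracy level (0-3) with the local-optimisation flag."""
    if not 0 <= accuracy <= O3_ACCURACY_MAX:
        raise ValueError("accuracy must be between 0 and 3")
    # Because accuracy never exceeds 3, OR-ing in the flag is the same as adding 4.
    return accuracy | (O3_LOCAL_ONLY if local_only else 0)


def align_local(probe, reference, accuracy: int = 0) -> float:
    """Align probe onto reference with local optimisation; returns the RMSD."""
    # Imported here so that o3a_options() can be used without RDKit installed.
    from rdkit.Chem import rdMolAlign

    o3a = rdMolAlign.GetO3A(probe, reference,
                            options=o3a_options(accuracy, local_only=True))
    return o3a.Align()
```

With accuracy 0 and the checkbox ticked this gives an options value of 4; with accuracy 3 it gives 7.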

Just to put it in context a bit: I am looking at two overlaid ligand-bound structures (same protein, different ligands), and I want the MMFF atom typing and partial charges of the two ligands to dictate which 'important' atoms I then look at RMSD values for (please don't scrutinise the idea too much, it is a work in progress!). The problem is that the Open3D alignment tries to give the 'best' global alignment, whereas what I want is a 'close' local alignment.


One other thing I have just noticed with the RDKit node: the aligned molecule doesn't seem to have hydrogen atoms, even if they were explicitly present in the input molecule! I'm sure I could add them back, but I wouldn't expect to need to, as they were present in the input... From memory, some of the other RDKit nodes also seem to strip explicit hydrogens, but I need to remind myself which.


Kind regards



We'll look into adding the local optimization option for the next release.

I can't reproduce the missing Hs problem... the attached workflow shows what I did. What piece am I missing?



Hi Greg,

Thanks for the example workflow.  I think this suggests that the problem may be a 'feature' of the molecule container auto-conversion...

In my example I was reading mol files in using the Molfile Reader, where the mols already had hydrogens and 3D coordinates. I was then splitting the input to use the first structure as the reference, and passing the two streams directly to the O3A node.

I now presume that the auto-conversion to an RDKit molecule object in the container is using the same defaults as an explicit RDKit from Molecule node would use(?), i.e. hydrogens are being stripped even though they are present in the input...

If this is the case, I think I may be back to 'explicit is always best' and will just go back to always using converters... Alternatively, can we consider making the automatic conversion to an RDKit mol object as 'hands-off' as possible (perhaps this only needs to be wrt hydrogens)?
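For comparison, this is how the same behaviour shows up in the Python bindings, where the mol block parser strips hydrogens by default unless `removeHs=False` is passed. This is only a sketch: whether the KNIME auto-conversion goes through the same defaults is the presumption above, not something confirmed in this thread, and the helper names `methanol_block_with_hs` and `atom_count` are made up for illustration.

```python
def methanol_block_with_hs() -> str:
    """Build a mol block for methanol with all hydrogens written explicitly."""
    # Imported inside the function so the module loads without RDKit installed.
    from rdkit import Chem

    mol = Chem.AddHs(Chem.MolFromSmiles("CO"))  # 2 heavy atoms + 4 hydrogens
    return Chem.MolToMolBlock(mol)


def atom_count(mol_block: str, keep_hs: bool) -> int:
    """Parse a mol block and return the number of atoms in the resulting graph."""
    from rdkit import Chem

    # removeHs defaults to True, which drops the explicit hydrogens on parsing.
    mol = Chem.MolFromMolBlock(mol_block, removeHs=not keep_hs)
    return mol.GetNumAtoms()
```

Parsing the same block with `keep_hs=False` leaves only the heavy atoms, while `keep_hs=True` preserves the hydrogens that were in the input, which is the 'hands-off' behaviour being asked for.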