RDKit Feature Request

Is it possible to have an RDKit Molecule PhysChem Calculator node that returns key PChem properties such as Molecular Weight, cLogP, cLogD, Polar Surface Area, Hydrogen Bond Acceptors, Hydrogen Bond Donors, Heavy Atom Count, Number of sp2 carbons, Number of sp3 carbons, Number of Heteroatoms, and Number of Rotatable Bonds? These would be really handy, and would save converting molecules into another format to get these properties (no single node gathers ALL of them directly) and then converting back to RDKit again. If possible, it would also be really powerful to have some predictors for CYP inhibition and Thermodynamic Solubility, since these are properties a medicinal chemist commonly wants to know about.




Adding a node that allows access to the RDKit descriptor calculators is on the ToDo list.

To explain why it hasn’t happened yet, I have to get into a bit of technical detail. The RDKit has a pretty broad selection of descriptors available (http://code.google.com/p/rdkit/wiki/DescriptorsInTheRDKit), but many of them are implemented in Python. For the calculators to be accessible from Knime, they need to be available in the RDKit’s Java wrapper, and the wrapper only exposes the C++ code. So the Python-based descriptors would have to be rewritten in C++. That work would pay off in more than just Knime (for example, it would also make the functions available from the RDKit’s database cartridge), but it’s still some work.

In the near(ish) term it probably makes sense to go ahead and build the descriptor-calculator node for Knime, starting with the limited set of descriptors that is available now, and then add to the list as more calculators are moved from Python to C++.
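
In the meantime, several of the requested properties can be computed directly from the RDKit's Python API. The following is only a rough sketch, assuming the RDKit Python package is installed; the function name `physchem_properties` and the dictionary keys are made up for illustration and are not part of the RDKit. cLogD, CYP inhibition, and thermodynamic solubility are left out because this sketch only uses calculators that ship with the RDKit.

```python
from rdkit import Chem
from rdkit.Chem import Crippen, Descriptors, Lipinski, rdMolDescriptors


def physchem_properties(smiles):
    """Return a dict of simple physicochemical properties for one SMILES.

    Hypothetical helper for illustration; the keys are arbitrary names.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)

    # Count carbons by hybridization state (assigned during sanitization).
    sp2 = Chem.HybridizationType.SP2
    sp3 = Chem.HybridizationType.SP3
    carbons = [a for a in mol.GetAtoms() if a.GetAtomicNum() == 6]

    return {
        "mol_wt": Descriptors.MolWt(mol),
        "clogp": Crippen.MolLogP(mol),  # Crippen (atom-contribution) logP
        "tpsa": rdMolDescriptors.CalcTPSA(mol),
        "hba": Lipinski.NumHAcceptors(mol),
        "hbd": Lipinski.NumHDonors(mol),
        "heavy_atoms": mol.GetNumHeavyAtoms(),
        "sp2_carbons": sum(1 for a in carbons if a.GetHybridization() == sp2),
        "sp3_carbons": sum(1 for a in carbons if a.GetHybridization() == sp3),
        "heteroatoms": Lipinski.NumHeteroatoms(mol),
        "rotatable_bonds": rdMolDescriptors.CalcNumRotatableBonds(mol),
    }


if __name__ == "__main__":
    # Print the properties for aspirin, one per line.
    for name, value in physchem_properties("CC(=O)Oc1ccccc1C(=O)O").items():
        print(name, value)
```

Note that the logP here is the RDKit's Crippen estimate, which will not necessarily agree with the cLogP values from other packages.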