RDKit Descriptor node request


In the RDKit Descriptor node, there is an option for "number of rings", but would it be possible to have an additional descriptor for "number of aromatic rings"? This is really useful for gauging the probable solubility of a molecule within a series.

Additionally, would it be possible to have a descriptor "sp3 character %", which again helps gauge solubility? This would be: (sp3 carbons / total carbons) * 100.



I'll put them both on the ToDo list.

In the meantime, note that you can already approximate both of these using the substructure counter and SMARTS queries:

  • number of sp3 carbons: "[CX4]"
  • number of aromatic rings: you need to combine a couple of queries and sum their counts: "a1aaaa1" and "a1aaaaa1". If you also want to include 7-rings: "a1aaaaaa1"
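
For anyone who wants to check these counts outside of KNIME, here is a minimal sketch of the same SMARTS approach using the RDKit Python API. The helper names (count_sp3_carbons, count_aromatic_rings, percent_sp3_carbons) are made up for illustration and are not part of RDKit or of the node; the percentage follows the formula from the original request.

```python
from rdkit import Chem

# SMARTS queries quoted in this thread
SP3_CARBON = Chem.MolFromSmarts("[CX4]")
AROMATIC_RING_QUERIES = [
    Chem.MolFromSmarts(smarts)
    for smarts in ("a1aaaa1", "a1aaaaa1", "a1aaaaaa1")  # 5-, 6- and 7-rings
]


def count_sp3_carbons(mol):
    """Number of carbons with four connections (hypothetical helper)."""
    return len(mol.GetSubstructMatches(SP3_CARBON))


def count_aromatic_rings(mol):
    """Sum of the unique matches for each aromatic ring query (hypothetical helper)."""
    return sum(len(mol.GetSubstructMatches(q)) for q in AROMATIC_RING_QUERIES)


def percent_sp3_carbons(mol):
    """sp3 carbons / total carbons * 100, or 0.0 if there are no carbons."""
    total = sum(1 for atom in mol.GetAtoms() if atom.GetAtomicNum() == 6)
    if total == 0:
        return 0.0
    return 100.0 * count_sp3_carbons(mol) / total


toluene = Chem.MolFromSmiles("Cc1ccccc1")
print(count_sp3_carbons(toluene), count_aromatic_rings(toluene))
print(round(percent_sp3_carbons(toluene), 1))
```

GetSubstructMatches returns one match per unique set of atoms by default, so each ring is counted once rather than once per starting atom.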

As an editorial aside, you might want to take a look at Peter Kenny's most recent paper: those two descriptors may not have quite as much to do with solubility as some other publications indicate. http://www.springerlink.com/index/10.1007/s10822-012-9631-5


Thanks for the extra information, as always, Greg.

I'll take a look at this paper; it looks like a good review on the topic of druggability properties.




I just posted a question in the general forum regarding the MQN descriptors (Nguyen, K. T., Blum, L. C., van Deursen, R., & Reymond, J.-L. (2009). Classification of Organic Molecules by Molecular Quantum Numbers. ChemMedChem, 4(11), 1803–1805. doi:10.1002/cmdc.200900317). Any chance of getting all of these implemented?





After skimming the list of descriptors, it looks like this should be relatively straightforward.

There are two obvious ways to do this:

  1. A separate MQN node
  2. An "MQN" set in the current descriptor node.

Do you prefer one over the other?



No preference! 



Peter Kenny's analysis aside, if you are adding aromatic rings then a full set of aromatic carbocyclic, aromatic heterocyclic, saturated carbocyclic and saturated heterocyclic ring counts might be useful, as per Ritchie, MacDonald et al.'s analysis, which is, I assume, where Simon is coming from?



Thanks for the additional suggestions. It should be no problem to get the full palette of ring definitions in.


Just a general reply to this thread: the new RDKit nightlies provide the MQN descriptors as well as (what I think is) a full set of compositional descriptors.
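
As a rough illustration of what these descriptors look like, here is a short sketch using the RDKit Python toolkit rather than the KNIME node. It assumes a reasonably recent RDKit installation; the column names the node uses for these values are not stated in this thread, so the dictionary keys below are just labels chosen for the example.

```python
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors

mol = Chem.MolFromSmiles("c1ccc2[nH]ccc2c1")  # indole, used as a test molecule

# The MQN descriptors come back as a list of 42 integers
mqn = rdMolDescriptors.MQNs_(mol)

# Ring-type counts along the lines suggested earlier in the thread
ring_counts = {
    "aromatic rings": rdMolDescriptors.CalcNumAromaticRings(mol),
    "aromatic carbocycles": rdMolDescriptors.CalcNumAromaticCarbocycles(mol),
    "aromatic heterocycles": rdMolDescriptors.CalcNumAromaticHeterocycles(mol),
    "saturated carbocycles": rdMolDescriptors.CalcNumSaturatedCarbocycles(mol),
    "saturated heterocycles": rdMolDescriptors.CalcNumSaturatedHeterocycles(mol),
}

# Fraction of sp3 carbons (0 to 1); multiply by 100 for the "sp3 character %"
sp3_percent = 100.0 * rdMolDescriptors.CalcFractionCSP3(mol)

print(len(mqn))
for name, value in ring_counts.items():
    print(name, value)
print(sp3_percent)
```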