Aromaticity in CDK

Hi guys,

I'm writing because I need some clarification about aromaticity perception in CDK. On the one hand, some CDK descriptors that are also implemented in KNIME (e.g. aLogP) "assume that aromaticity has been detected before evaluating the descriptor" (ref.: http://cdk.github.io/cdk/1.4/docs/api/org/openscience/cdk/qsar/descriptors/molecular/ALOGPDescriptor.html). On the other hand, I cannot find any way to perceive aromaticity in KNIME. I would expect two options (similar to how they are implemented in RDKit): one to transform a molecule from the Kekulé form to the aromatic one, and another to do the opposite. Are they available in KNIME-CDK?

Using KNIME, I noticed that when I import an aromatized molecule into CDK from an SDF file (with bond order = 4 in the bond block) and re-export it to another SDF file, I lose the aromatic information. To me it seems that CDK is not able to properly detect aromaticity from an SDF file. If instead the aromatized molecule is read from a SMILES file (with lower-case letters for aromatic atoms), then CDK correctly manages the aromatic information. It also works well if an aromatized SDF molecule is loaded into CDK and then exported as SMILES. So I'm pretty confused about this issue.

I'm probably misunderstanding something. Could somebody please clarify this issue?

Many thanks,

Gio

Hi Gio,

thanks for your post. Aromaticity is perceived by default in KNIME-CDK. Whether molecules are displayed in Kekulé or aromatic form, they are annotated as aromatic. We removed the option to transform molecules from one representation to another some time ago in order to simplify usage. The visualisation preferences can be changed in the KNIME-CDK preference page.

Regarding bond order 4 and aromaticity in general: technically, bond type values 4 through 8 are for SSS queries only and should not be used in SDfiles to indicate aromaticity. In my opinion, there simply isn't a need to indicate aromaticity in a file format from a cheminformatics perspective. For more information, please see this blog post, which summarizes the issues around aromaticity very well. Even the SMILES representation used in KNIME-CDK shouldn't be aromatic SMILES, but for various reasons (including my own weakness) I couldn't yet bring myself to eliminate the aromatic option. :)
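
For anyone who wants to check whether their own files use these query bond types, here is a minimal Python sketch. It is not part of CDK or KNIME, and the function names (find_query_bonds, make_benzene) are hypothetical helpers made up for this illustration. It assumes a single V2000 molfile record with the usual fixed-width layout (three header lines, a counts line, then the atom and bond blocks) and does no error handling.

```python
# Bond type codes in the V2000 bond block that are meant for
# substructure (SSS) queries rather than for concrete structures.
QUERY_BOND_TYPES = {
    4: "aromatic",
    5: "single or double",
    6: "single or aromatic",
    7: "double or aromatic",
    8: "any",
}


def find_query_bonds(molblock):
    """Return (atom1, atom2, bond_type) for every bond whose type is 4 to 8.

    Assumes a well-formed V2000 molfile; illustrative only.
    """
    lines = molblock.splitlines()
    counts = lines[3]                 # 4th line is the counts line
    n_atoms = int(counts[0:3])        # columns 1-3: number of atoms
    n_bonds = int(counts[3:6])        # columns 4-6: number of bonds
    first_bond = 4 + n_atoms          # bond block follows the atom block
    hits = []
    for line in lines[first_bond:first_bond + n_bonds]:
        atom1 = int(line[0:3])
        atom2 = int(line[3:6])
        bond_type = int(line[6:9])
        if bond_type in QUERY_BOND_TYPES:
            hits.append((atom1, atom2, bond_type))
    return hits


def make_benzene(bond_types):
    """Build a toy benzene molfile (dummy coordinates) with the given six bond types."""
    header = ["benzene", "  hypothetical example", ""]
    counts = ["  6  6  0  0  0  0  0  0  0  0999 V2000"]
    atoms = ["    0.0000    0.0000    0.0000 C   0  0"] * 6
    bonds = [
        f"{i:3d}{i % 6 + 1:3d}{t:3d}  0"
        for i, t in zip(range(1, 7), bond_types)
    ]
    return "\n".join(header + counts + atoms + bonds + ["M  END"])


if __name__ == "__main__":
    aromatic = make_benzene([4] * 6)            # bond type 4 on every ring bond
    kekule = make_benzene([1, 2, 1, 2, 1, 2])   # alternating single/double
    print(len(find_query_bonds(aromatic)), "query-type bonds in the aromatic file")
    print(len(find_query_bonds(kekule)), "query-type bonds in the Kekulé file")
```

A file written in Kekulé form should give an empty list, while a file that encodes aromaticity with bond type 4 will list every such bond.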

Back to the problem at hand: can you please clarify what you mean by 'CDK correctly manages the aromatic information' in the case of aromatic SMILES? Does that mean you get an SDfile out with bond type 4? That would indeed be a bug.

Also, would you mind sharing your use case? What do you need the aromaticity in the SDfile for? If you can do without it, this might be better long term. Sorry to answer your question with more questions. :)

Best regards,

Stephan

Dear Stephan,

Thank you for your quick and detailed answer. It's a relief knowing that CDK automatically detects aromaticity by default.

About bond order 4 and aromaticity: thank you so much for your clear answer. I'm not an expert in the ctab and SDfile formats, and I wasn't aware that bond types 4 to 8 are reserved for SSS queries. Now everything is clear. I would also point other users who are unaware of this to this blog post (http://chem-bla-ics.blogspot.com.es/2011/10/cdk-file-formats-1-mdl-molfiles-and.html), which explains the issue very well.

Back to the problem: when I stated “if the aromatized molecule is read from a SMILES file then CDK correctly manages the aromatic information”, I just meant that in that case the aromaticity was shown in the CDK molecule renderer, while I didn't see it when exporting to an SDfile. But that was only because the Marvin renderer was enabled in the latter case and the CDK renderer in the former. So I was confused by the renderers. In any case, no molecule with bond order 4 is created, so there is no bug in CDK.

What do I need the aromaticity for? Well, I just wanted a way to check whether the aromaticity of molecules (or, better said, “conjugational equivalence”) was detected or not. Again, my problem was that I was confusing CDK's aromaticity detection (which I now know is always enabled) with the rendering of aromaticity (pictures with aromatic circles inside the rings), which can be enabled or disabled depending on the user preferences and the chosen renderer.

I hope this post can also help other users who are confused about this matter.

Stephan, thanks again for your help!

Gio

Excellent, thanks for sharing your experience and the link to the blog post. I'm sure that will help others.