Kekulization / visualization bug?

Hi,

first of all - thanks for the RDKit updates - good job!

I've noticed that the kekulization node doesn't work as expected. Here is an example:

http://www.knime.org/files/knime_social_media_white_paper.pdf

From left: source SDF, converted to RDKit, after kekulization.

There are no traces of aromaticity in the RDKit structures, even after kekulization (and aromatization).

Hi,

the changes made by the Aromatizer and Kekulizer nodes are only visible if you switch the renderer to show the SMILES string, or if you use, for instance, the Marvin renderer on the SMILES string. The RDKit 2D Depictor renderer always draws a molecule the way you saw it, because it kekulizes the molecule on the fly before drawing it. So it really is a visibility issue and is not considered a bug.
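To check this outside KNIME, here is a minimal sketch using the RDKit Python API. It assumes the rdkit package is installed, and it is only an illustration; the KNIME nodes may be implemented differently. It shows that aromaticity is stored on the molecule and shows up in the SMILES output, independently of how a renderer draws the bonds:

```python
from rdkit import Chem

# Parse benzene written with alternating single/double bonds.
mol = Chem.MolFromSmiles("C1=CC=CC=C1")

# Parsing perceives aromaticity, so every ring bond is flagged aromatic.
all_aromatic = all(bond.GetIsAromatic() for bond in mol.GetBonds())

# The default SMILES output uses the aromatic (lowercase) form.
aromatic_smiles = Chem.MolToSmiles(mol)

# Kekulize a copy and ask for the Kekule form explicitly.
kekulized = Chem.Mol(mol)
Chem.Kekulize(kekulized)
kekule_smiles = Chem.MolToSmiles(kekulized, kekuleSmiles=True)

print(all_aromatic)
print(aromatic_smiles)
print(kekule_smiles)
```

Both strings describe the same molecule; only the bond notation differs.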

Kind regards,

Manuel

Ok, thanks for the answer.

best wishes,

Filip

Hi Manuel,

If I rolled back time a year or two, this would have really confused me (I know that's not saying much). It's only traversing the dark side of comp chem that allows me to understand that it's just a visualisation issue, as the SMARTS/SMILES are represented correctly as aromatic.

However, there will be lots of chemists using KNIME who will be unaware that the visualisation is not necessarily representative of the actual string representation. I would consider upgrading the RDKit visual depictions, if possible, to alleviate this probable issue for new users.

Thanks, simon.

Hi Simon,

I appreciate the point. However, given that the default depiction (the one showing the kekulized form of the structure) is preferred by almost everyone (even many of those who live in the dark world of comp chem!), I'm extremely reluctant to change it.

For people who really want to see the horrible circular aromaticity rendering: the Marvin view is always available.

-greg

 

Hi Greg,

Thanks for the comments. I agree the circular aromaticity isn't pretty!

However, I can see comments and confusion from users coming up again under the current setup. How about a nice easy fix: adding a comment to the Aromatizer node description?

Something along the lines of "Note, although the output molecules are aromatised, and will be treated as such, the RDKit renderer in the table visualisation will still show them in kekulised form. However, other molecule renderers may visualise aromaticity differently."

Simon.

Nice one. I like the idea.

Hi guys,

I'm sorry to come back to this problem again, but I'm quite confused. If I upload an aromatized molecule in SDF format, then import it into RDKit and aromatize it, it seems that the loss of aromaticity is not only a rendering problem. In fact, I tried to render the “RDKit Mol (Aromatized)” column with Marvin, with Indigo, and also as an SDF string, but in all these cases I can see no aromaticity. As you can see in the picture, in the case of the SDF string the bond types are alternating single and double bonds, not aromatic bonds.

Please, can anybody clarify this behavior?

Gio

I am trying to do substructure searches with the RDKit nodes. I draw a molecule in Marvin Sketch and get the SMARTS: OC(=O)C1=CN(N=C1c1ccccc1)c1ccccc1

This does not get me a hit.

I copy the molecule from the node "RDKit From Molecule" and paste it into Marvin Sketch and get this SMARTS: OC(=O)c1cn(nc1-c1ccccc1)-c1ccccc1.

This will get me hits. 

It seems OC(=O)c1cn(nc1-c1ccccc1)-c1ccccc1 is the same as O=C(O)c1cn(-c2ccccc2)nc1-c1ccccc1. Both work.
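One way to confirm that two strings really are the same molecule is to compare their canonical SMILES. This is a minimal sketch with the RDKit Python API (assumed to be installed), not something the KNIME nodes are known to do internally:

```python
from rdkit import Chem

# The two aromatic strings from this post.
marvin_form = "OC(=O)c1cn(nc1-c1ccccc1)-c1ccccc1"
rdkit_form = "O=C(O)c1cn(-c2ccccc2)nc1-c1ccccc1"

# Canonical SMILES is independent of how the input string was written.
canonical_a = Chem.MolToSmiles(Chem.MolFromSmiles(marvin_form))
canonical_b = Chem.MolToSmiles(Chem.MolFromSmiles(rdkit_form))

same_molecule = canonical_a == canonical_b
print(same_molecule)
```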

I tried Aromatizer and Kekulizer to convert OC(=O)C1=CN(N=C1c1ccccc1)c1ccccc1 into OC(=O)c1cn(nc1-c1ccccc1)-c1ccccc1. This means I draw the structure in Marvin Sketch and get OC(=O)C1=CN(N=C1c1ccccc1)c1ccccc1. I then try to use nodes to get the aromatic form, or I try to convert everything to alternating double bonds. I could not do it.

 

Hi,

At the moment there's no way to tell the RDKit SDF generator in KNIME to skip the kekulization step, so you always get the localized form. We can look into it and see if there's a way to change it, but there's not a quick fix to this problem.

If you're willing to live with having your molecules as SMILES instead of SDF (I know this isn't the same), you will get the aromatic form when you use the "Molecule typecast" node to convert the RDKit molecule column to SMILES.
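For illustration only, the same kekulization step can be seen in the RDKit Python API (assumed to be installed), where the mol block writer kekulizes by default and takes a kekulize flag. This says nothing about what the KNIME SDF generator exposes. The bond_types helper is a hypothetical function written for this sketch; it reads the bond-type column of a V2000 mol block:

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("c1ccccc1")

# Default behaviour: the writer kekulizes, so bonds come out as 1 and 2.
localized = Chem.MolToMolBlock(mol)

# With kekulize=False, aromatic bonds are written as bond type 4.
delocalized = Chem.MolToMolBlock(mol, kekulize=False)


def bond_types(mol_block):
    """Hypothetical helper: sorted bond types from a V2000 mol block."""
    lines = mol_block.splitlines()
    counts = lines[3]  # counts line: atom count, then bond count
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])
    bond_lines = lines[4 + n_atoms : 4 + n_atoms + n_bonds]
    # Columns 7-9 of each bond line hold the bond type.
    return sorted(int(line[6:9]) for line in bond_lines)


print(bond_types(localized))
print(bond_types(delocalized))
```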

-greg

Hi,

Apologies in advance for the somewhat technical answer. It is unfortunately necessary to get a bit into the guts of chemical file formats.

This SMARTS:

OC(=O)C1=CN(N=C1c1ccccc1)c1ccccc1

is asking for molecules that contain an aliphatic pyrazole ring. Since the RDKit considers pyrazoles to be aromatic, this will never match anything.

The SMILES:

OC(=O)c1cn(nc1-c1ccccc1)-c1ccccc1

has the pyrazole as aromatic, so it can generate matches.

If you construct an RDKit molecule from "OC(=O)C1=CN(N=C1c1ccccc1)c1ccccc1" as SMILES (instead of SMARTS), then the aromatization step which is automatically carried out when a SMILES (or SDF) is parsed will convert the pyrazole ring to its aromatic form, so you will get matches. This aromatization step is *not* carried out when a molecule is built from SMARTS (it would be incorrect to do so).

If you want to sketch query molecules it's generally safer to tell the Marvin Sketch node to output SMILES and not SMARTS. If you need to include query features and use SMARTS, then you should aromatize the molecule first in Marvin Sketch (Structure->Aromatic Form->Convert to Aromatic Form). An alternative is to tell the sketcher to export SDF cells: those can include query features and are generally correctly aromatized on input.
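The difference between the two interpretations can be reproduced with a minimal sketch in the RDKit Python API (assumed to be installed; the variable names are just for illustration). The same text is parsed once as SMARTS and once as SMILES, then used as a query against the aromatic target:

```python
from rdkit import Chem

# Target molecule; parsing as SMILES aromatizes the pyrazole ring.
target = Chem.MolFromSmiles("OC(=O)c1cn(nc1-c1ccccc1)-c1ccccc1")

kekule_text = "OC(=O)C1=CN(N=C1c1ccccc1)c1ccccc1"

# As SMARTS: uppercase C and N mean aliphatic atoms, no aromatization is done.
query_from_smarts = Chem.MolFromSmarts(kekule_text)

# As SMILES: the pyrazole ring is aromatized during parsing.
query_from_smiles = Chem.MolFromSmiles(kekule_text)

smarts_hit = target.HasSubstructMatch(query_from_smarts)
smiles_hit = target.HasSubstructMatch(query_from_smiles)

print(smarts_hit)
print(smiles_hit)
```

The SMARTS query finds nothing because it asks for an aliphatic pyrazole, while the SMILES-derived query matches.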

Best,

-greg

Hi Greg

Thanks a lot for the explanation. This helps.

Alex