General SMARTS Patterns for any ring size

Hello,

I can't figure out how to write a general SMARTS pattern for some cases. For the simple case of a nitrogen heterocycle, we can simply use [n,NR], but what if I am trying to match a nitrogen heterocycle that does not have an oxygen in the same ring? I can't think of a general pattern that would work for any ring size. The only thing I can think of is to write one pattern per ring size, such as [#7R]1@[!#8R]@[!#8R]@[!#8R]1 for a 4-membered heterocycle, but this quickly gets tedious and, as mentioned, is not a general solution. Thanks in advance!

Hi Jean,
Unfortunately I don’t have a solution for you. It’s quite a technical cheminformatics question. I usually try the SMARTSEditor to understand what the right solution would look like.

I hope it helps.
Best,
Daria


If you only want to consider aromatic rings, or rings up to a certain size, then you could enumerate the SMARTS patterns for each ring size computationally. It’s not pretty, but that is exactly what we did for the ring counts in our “The Medicinal Chemist’s Toolbox” paper (J. Med. Chem. 2011, 54, 3451-3479; DOI: 10.1021/jm200187y). I can’t think of any other way unfortunately.
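As a rough illustration of enumerating one pattern per ring size, here is a minimal Python sketch that only builds the SMARTS strings. It is not the code from the paper; the function names `ring_smarts` and `enumerate_ring_smarts` and the default size range are made up for this example, and the atom primitives are taken from the 4-membered pattern in the original question.

```python
def ring_smarts(ring_size, head="[#7R]", member="[!#8R]"):
    """Build a single-ring SMARTS with one head atom and ring_size - 1 member atoms.

    The ring is opened and closed with ring-closure digit 1, and consecutive
    atoms are joined by '@' (any ring bond), as in the question's example.
    """
    if ring_size < 3:
        raise ValueError("a ring needs at least 3 atoms")
    atoms = [head + "1"] + [member] * (ring_size - 2) + [member + "1"]
    return "@".join(atoms)


def enumerate_ring_smarts(min_size=3, max_size=8):
    """Return {ring size: SMARTS} for every size in the (assumed) range."""
    return {n: ring_smarts(n) for n in range(min_size, max_size + 1)}


if __name__ == "__main__":
    for size, pattern in enumerate_ring_smarts().items():
        print(size, pattern)
```

For ring_size=4 this reproduces the pattern from the question, [#7R]1@[!#8R]@[!#8R]@[!#8R]1. One caveat worth checking in your own toolkit: a ring-closure pattern can match any cycle of that size in the molecular graph, so in fused systems the larger patterns may also match cycles that run around more than one ring.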

Steve


Hi Jean,

Sorry for the slow response - I only thought about your question when I saw Daria’s reply.
Interesting one!

I’m no SMARTS guru, but I would go for something like this - using recursive patterns:
[#7R;!$(@[#8]);!$(@@[#8]);!$(@@@[#8]);!$(@@@@[#8])]

So we are looking for a nitrogen atom in a ring that must also not be 1, 2, 3, or 4 ring bonds away from an oxygen. Of course, you still need to explicitly extend the recursive components to cover increasingly large ring sizes - but I think each additional pattern increases the ring size by 2 (so the example is good up to n=8).

The other way you could consider (depending on the size of your search space) is to fragment on all non-ring bonds, and exclude any molecules that have a fragment that matches this component SMARTS:

([#7R].[#8R])

Kind regards

James


Just noticed that the * characters are not rendering properly in that last post!
As I can’t seem to figure out how to edit a post, I’ve copy-pasted the above SMARTS here, and set it as preformatted text:

[#7R;!$(*@[#8]);!$(*@*@[#8]);!$(*@*@*@[#8]);!$(*@*@*@*@[#8])]

Sorry about that!
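Since the recursive pattern has to be extended by hand for larger rings, it can also be generated. The sketch below is only an illustration: the function names and the `max_bonds` parameter are invented here, and it does nothing more than assemble the string.

```python
def nitrogen_without_ring_oxygen(max_bonds=4):
    """Build [#7R;!$(*@[#8]);!$(*@*@[#8]);...] with paths of 1..max_bonds ring bonds."""
    if max_bonds < 1:
        raise ValueError("max_bonds must be at least 1")
    exclusions = []
    for k in range(1, max_bonds + 1):
        # k repetitions of '*@' give a path of k ring bonds ending on an oxygen
        path = "*@" * k + "[#8]"
        exclusions.append("!$(" + path + ")")
    return "[#7R;" + ";".join(exclusions) + "]"


def max_bonds_for_ring_size(ring_size):
    """Two atoms in one ring of ring_size atoms are at most ring_size // 2 ring bonds apart."""
    return ring_size // 2


if __name__ == "__main__":
    print(nitrogen_without_ring_oxygen(4))
    print(nitrogen_without_ring_oxygen(max_bonds_for_ring_size(12)))
```

With max_bonds=4 this gives exactly the preformatted pattern above. By the floor(n/2) argument, paths of up to 4 ring bonds should reach every atom of a single ring with up to 9 members, so the n=8 estimate looks slightly conservative. Two things to keep in mind: the '@' paths are free to run through fused or spiro-connected rings, so an oxygen in a neighbouring fused ring within range will also exclude the nitrogen, and the pattern should be tested against your own toolkit before relying on it.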
