How to define a peptide bond within a ring, using SMARTS?

Hi there,
I am struggling to define a structural feature. I have a set of structures that belong to different groups. I identified and sorted them manually using a single feature: a peptide bond within the cyclic part of the structure.
I would like to define that feature with SMARTS and use it in my workflow, but I could not find a SMARTS pattern for a peptide bond within the ring part of a structure.

Does anyone know a SMARTS pattern for a peptide bond within a ring, or have another idea for how I could sort the structures by that feature?

Thank you

Can you provide some examples of the types of molecules you’d like to sort? It would be helpful to see what you’d like to keep and what you’d like to discard.

I did a quick test using 3 molecules -

Erythromycin
C[C@@H]([C@@H]([C@H](C(O[C@@H]([C@@]([C@H](O)[C@@H](C)C1=O)(C)O)CC)=O)C)O[C@H]2C[C@]([C@@H](O)[C@H](C)O2)(OC)C)[C@H]([C@](C)(O)C[C@H]1C)O[C@H](O[C@@H](C[C@@H]3N(C)C)C)[C@@H]3O

Erythromycin but with the ester swapped for an amide:
C[C@@H]([C@@H]([C@H](C(N[C@@H]([C@@]([C@H](O)[C@@H](C)C1=O)(C)O)CC)=O)C)O[C@H]2C[C@]([C@@H](O)[C@H](C)O2)(OC)C)[C@H]([C@](C)(O)C[C@H]1C)O[C@H](O[C@@H](C[C@@H]3N(C)C)C)[C@@H]3O

Tri-arginine:
NC(CCCNC(N)=N)C(NC(CCCNC(N)=N)C(NC(CCCNC(N)=N)C(O)=O)=O)=O

Using 2 sequential RDKit Substructure Filters, I was able to pull out only the erythromycin with the amide bond.

The first filter used the SMARTS query [NX3][CX3](=[OX1])[#6] to match amide bonds. The second filter used the SMARTS query [r{6-}] to match atoms whose smallest ring has 6 or more members; raise the lower bound if you only want macrocycles. The number can be customized based on your use case.
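For anyone who wants to try the same idea outside KNIME, here is a minimal sketch of the two sequential filters in plain RDKit Python. The SMILES and SMARTS are the ones from this post; the function name `passes_both_filters` and the `results` dictionary are just illustrative, and the sketch assumes an RDKit version that supports range queries such as `{6-}`.

```python
from rdkit import Chem

# SMILES copied from the post above.
ERYTHROMYCIN = (
    "C[C@@H]([C@@H]([C@H](C(O[C@@H]([C@@]([C@H](O)[C@@H](C)C1=O)(C)O)CC)=O)C)"
    "O[C@H]2C[C@]([C@@H](O)[C@H](C)O2)(OC)C)[C@H]([C@](C)(O)C[C@H]1C)"
    "O[C@H](O[C@@H](C[C@@H]3N(C)C)C)[C@@H]3O"
)
ERYTHROMYCIN_AMIDE = (
    "C[C@@H]([C@@H]([C@H](C(N[C@@H]([C@@]([C@H](O)[C@@H](C)C1=O)(C)O)CC)=O)C)"
    "O[C@H]2C[C@]([C@@H](O)[C@H](C)O2)(OC)C)[C@H]([C@](C)(O)C[C@H]1C)"
    "O[C@H](O[C@@H](C[C@@H]3N(C)C)C)[C@@H]3O"
)
LINEAR_TRIPEPTIDE = "NC(CCCNC(N)=N)C(NC(CCCNC(N)=N)C(NC(CCCNC(N)=N)C(O)=O)=O)=O"

# Filter 1: any amide bond. Filter 2: any atom whose smallest ring has >= 6 members.
AMIDE = Chem.MolFromSmarts("[NX3][CX3](=[OX1])[#6]")
RING_6_PLUS = Chem.MolFromSmarts("[r{6-}]")


def passes_both_filters(smiles):
    """Return True if the molecule matches both SMARTS queries."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    return mol.HasSubstructMatch(AMIDE) and mol.HasSubstructMatch(RING_6_PLUS)


results = {
    "erythromycin": passes_both_filters(ERYTHROMYCIN),
    "erythromycin amide analog": passes_both_filters(ERYTHROMYCIN_AMIDE),
    "linear tripeptide": passes_both_filters(LINEAR_TRIPEPTIDE),
}
print(results)
```

Note that the two patterns are tested independently, so this keeps any molecule that has an amide somewhere and a qualifying ring somewhere; it does not require the amide itself to sit in the ring.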

If you really just want peptide bonds in rings, you can probably use an RDKit Substructure Filter and tweak @elsamuel’s SMARTS to directly include ring queries: [NRX3]-@[CRX3](=[OX1])-@[#6R]
That constrains each of the backbone atoms and bonds to be in a ring.
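As a rough illustration of what the ring constraints do, the sketch below runs that SMARTS in RDKit Python on three small molecules. The test molecules and the helper name `has_ring_amide` are my own picks for illustration and are not from the thread.

```python
from rdkit import Chem

# Ring-constrained amide: N, carbonyl C, and the next carbon must all be ring
# atoms, and the N-C and C-C bonds must be ring bonds ("-@").
RING_AMIDE = Chem.MolFromSmarts("[NRX3]-@[CRX3](=[OX1])-@[#6R]")

# Illustrative molecules (assumptions, chosen to probe each constraint).
examples = {
    "2-piperidinone": "O=C1CCCCN1",             # amide fully inside the ring
    "N-cyclohexylacetamide": "CC(=O)NC1CCCCC1",  # amide outside the ring
    "N-acetylpiperidine": "CC(=O)N1CCCCC1",      # only the N is a ring atom
}


def has_ring_amide(smiles):
    """Return True if the molecule contains an amide whose backbone is in a ring."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    return mol.HasSubstructMatch(RING_AMIDE)


ring_amide_hits = {name: has_ring_amide(smi) for name, smi in examples.items()}
print(ring_amide_hits)
```

Only the lactam should match here: in the other two molecules the carbonyl carbon is not a ring atom, so the pattern rejects them even though a ring and an amide are both present.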

Thank you so much for your help. I implemented it and it worked: it sorted the structures by that criterion. What I actually want is to classify a structure as a cyclopeptide when it has at least 3 peptide bonds. For example, if it has at least 3 peptide bonds it is a cyclopeptide (and I would give it the value 1); if not, it is not a cyclopeptide (value 0). I would like to do this in a binary way.
Is it possible to write that kind of condition in SMARTS?

Thank you 🙂

One approach is to use the RDKit Substructure Counter node to calculate the number of times a peptide bond occurs in a given molecule. Then you can use a Rule Engine node to apply your desired condition.
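The same count-then-threshold logic can be sketched in RDKit Python. This is not what the KNIME nodes run internally, only an equivalent under some assumptions: it counts matches of the ring-constrained SMARTS from earlier in the thread, uses the threshold of 3 from the question, and the function names and glycine test molecules are illustrative choices of mine.

```python
from rdkit import Chem

# Assumption: "peptide bond" here means the ring-constrained amide pattern
# suggested earlier in the thread.
RING_PEPTIDE_BOND = Chem.MolFromSmarts("[NRX3]-@[CRX3](=[OX1])-@[#6R]")
MIN_PEPTIDE_BONDS = 3  # threshold taken from the question


def count_ring_peptide_bonds(smiles):
    """Count unique matches of the ring peptide bond pattern."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    return len(mol.GetSubstructMatches(RING_PEPTIDE_BOND))


def is_cyclopeptide(smiles, threshold=MIN_PEPTIDE_BONDS):
    """Return 1 if the count reaches the threshold, otherwise 0."""
    return 1 if count_ring_peptide_bonds(smiles) >= threshold else 0


# Illustrative molecules (assumptions, not from the thread).
cyclo_triglycine = "O=C1CNC(=O)CNC(=O)CN1"   # three amides in one ring
diketopiperazine = "O=C1CNC(=O)CN1"          # two amides in one ring
linear_triglycine = "NCC(=O)NCC(=O)NCC(=O)O"  # no ring at all

labels = {
    "cyclo(Gly-Gly-Gly)": is_cyclopeptide(cyclo_triglycine),
    "diketopiperazine": is_cyclopeptide(diketopiperazine),
    "linear Gly-Gly-Gly": is_cyclopeptide(linear_triglycine),
}
print(labels)
```

Keeping the threshold as a parameter mirrors the Rule Engine step, so the cutoff can be changed without touching the SMARTS.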

Thank you guys so much. It worked as you described. 😃
