Atom Replacer Feature Request

Hi,

The Atom Replacer node is great for generating carbon skeletons from Murcko scaffolds by replacing all atoms with C, for instance. I really like this for looking at large datasets and grouping all the carbon skeletons together to see which are most exemplified in the literature.

However, with a small enhancement, it could do much more. Would it be possible for the Atom Replacer node to have two options:

1. To replace all atoms with the user-selected atom (as is the current situation).

2. To replace a specific atom of the user's choice (e.g. Br, Cl, F, A, attachment points, etc.) with the user-selected atom. This would be very useful with the R Group Decomposition node, for instance: you could run this node on the decomposed R groups to replace the attachment points with "Br", and then search for these reagents to see whether they are commercially available for an alkylation, a Buchwald reaction, etc.

Thanks,

Simon.

Simon,

I've not tried this at all, so apologies if it is a non-starter, but would it be possible to achieve this using the RDKit 1-component reaction node with a suitable reaction SMARTS defined?

Steve

Hi Simon,

Steve is right. If you need help with the SMARTS reaction notation, you know where to find me! :)

Regards,

George

Okay, I can give that a go, but I have no idea how to define an attachment point (which the R Group Decomposition nodes generate in Indigo and RDKit), nor am I very good with the notation.

It just seems a clumsy way of doing it to me, having to draw a reaction just to do such a simple transformation. It would be much more intuitive for a non-computational chemist (like myself) to be able to do it within a node as described above, rather than writing out SMARTS reaction notation.

I will try it, but for me, it's not very user friendly!

 

Simon.

Hi Simon,

I gave the suggestion a quick go this morning, and maybe the following helps(?).  I should say that I started with RDKit molecules, but had a problem with the RDKit R-group decomposition node (see http://tech.knime.org/forum/rdkit/problem-with-r-group-decomposer), so translated to Indigo.  Anyway, end result was:

  • a set of indoles tagged-up with R-groups by the Indigo R-group decomposition node
  • converted these to molecules (smiles)
  • converted smiles->MRV->smiles (MolConverter) because RDKit doesn't like the Indigo SMILES/SDF/MOL from the R-group decomposer, but Marvin sorts this out!
  • Now pass the smiles into RDKit's 1-component-rxn node, and as an example you can convert the R1 group to Br with the following rSMARTS:  [*:1]>>[Br:1]

The rSMARTS shown actually leaves any other R-group attachment points alone. You can of course do simpler or more complex transformations (e.g. replace with Ph: [*:1]>>[c:1]1ccccc1), but working with attachment points in this fashion has its limitations, particularly if you want to remove all other attachment points or transform them all to Hs!

The first option could be done with the String Replacer, but I have tried and failed to get the * character to behave as a literal rather than a wildcard character! The second option can't be done (as far as I am aware), because you can't match all * atoms in one rSMARTS conversion: * maps as 'any atom'!

Anyway, I agree that this is not the most 'streamlined' system, but I hope it may help in the short term.
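
For what it's worth, the literal-asterisk problem is easy to see in a plain regular expression. The sketch below is ordinary Python, not the String Replacer node, and the helper names replace_labelled_point and is_valid_pattern are made up for illustration; it assumes attachment points are written as [*:1], [*:2] and so on. Escaping the asterisk makes it match literally, so one labelled attachment point can be swapped while the others are left alone:

```python
import re


def replace_labelled_point(smiles: str, label: int, atom: str) -> str:
    """Swap the attachment point written as [*:label] for `atom`.

    Hypothetical helper: a pure text edit with no chemistry checks.
    """
    # re.escape builds a pattern that matches "[*:1]" character for character.
    pattern = re.escape(f"[*:{label}]")
    # A function as the replacement means `atom` is inserted verbatim.
    return re.sub(pattern, lambda _match: atom, smiles)


def is_valid_pattern(pattern: str) -> bool:
    """Return True if `pattern` compiles as a regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Only the R1 attachment point is replaced; R2 is untouched.
print(replace_labelled_point("[*:1]c1ccc([*:2])cc1", 1, "Br"))
# A bare "*" is a quantifier with nothing to repeat; "\*" is a literal asterisk.
print(is_valid_pattern("*"), is_valid_pattern(r"\*"))
```

Whether the String Replacer accepts the same escaping is not something this sketch can confirm.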

Kind regards

James

Thanks very much for looking into this and giving me a way to do it. But I'm glad you agree it's not particularly streamlined. It's all rather clumsy for what is a relatively simple transformation, which would be possible with a minor enhancement to the Atom Replacer node. I'm trying to ensure features in KNIME are as "chemist friendly" as possible, as some of my colleagues are less savvy when it comes to visualising structures as SMILES/SMARTS strings. If features can be manipulated within nodes in a simpler manner, I think more non-computational chemists will use it.

As you mention, in theory the String Replacer node should be able to do this, but replacing a "*" is tricky, it would seem. I just hope the Indigo team can offer this implementation :-)

 

Thanks,

Simon.

Thanks all!

I would certainly second Simon's comments re. usability for chemists (as a chemist), so it would be great to have a more straightforward solution.  For those dabbling with them and wanting to check their SMARTS strings, SMARTSviewer (http://smartsview.zbh.uni-hamburg.de/) is very useful.

Steve

Hi all,

@James, the SMARTS reaction [*:1]>>[Br:1] would attempt to convert any heavy atom to Br, which is certainly not what you want! The RDKit 1-component reaction has some weird but ultimately sensible behaviour with this reaction: for the SMILES CCOC*, it returns 2 products, CCOCBr and BrCOC*, because the valence of Br means it can only replace terminal atoms. If you try the reaction [*:1]>>[Si:1], for example, you'll get many more products by iteratively replacing all heavy atoms with Si.

The unambiguous way to replace an R-group (asterisk) with a given atom using a proper SMARTS reaction would be this one: [0*:1]>>[Br:1], where you match only atoms of zero atom mass, i.e. asterisks. In this way, one could also replace selected R-groups by "isotopic labelling".

As for the string replacement, you're right, the KNIME node cannot deal with a literal '*'. However, you may use the Perl node or even my favourite, the JPython function node, where you may enter the command val("SmilesColumn").replace('*','Br'), where SmilesColumn is the column that holds the SMILES strings to be replaced.
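
Outside KNIME, the same replace call can be sketched in plain Python. The function name and the example SMILES list are made up for illustration, and this is only a text substitution, so it does not check that the result is chemically sensible:

```python
def replace_attachment_points(smiles_list, atom="Br"):
    """Replace every literal '*' in each SMILES string with `atom`."""
    # str.replace treats '*' as an ordinary character, so no escaping is needed.
    return [smiles.replace("*", atom) for smiles in smiles_list]


# Hypothetical decomposed R groups, each with one attachment point.
decomposed_r_groups = ["CCOC*", "*c1ccccc1"]
print(replace_attachment_points(decomposed_r_groups))
```

Note that every asterisk in a string is replaced, so a fragment with several attachment points would get the same atom at each of them.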

Hope this helps, I really enjoy this thread!

Best regards,

George P.

Hi George,

Glad you are enjoying this thread - me too!  Thanks for the [0*:1]>>[Br:1] pointer.  When I 'tested' the [*:1]>>[Br:1] the only behaviour I saw was the *(1) pseudoatom being replaced by Br...  I didn't think to question this (but it was very early this morning!) and realise this was because RDKit wouldn't be so silly as to pass back nonsense results with Br in the ring!  The Si example that you gave makes this clear!

And of course the trusty JPython node is the way to go, until the String Replacer is improved!

Kind regards

James

Simon,

As I understand it, you have found a way to solve this, but could you clarify your requirements for the node with respect to decomposed R groups?

Do you need to replace the R1, R2, R3 (etc.) atoms in the scaffold with some special atom, or do you need to attach atoms to the attachment points in the R groups?

The Atom Replacer node can now replace R sites (R1, R2, R3, ...) with the specified atom.

Best regards,
Mikhail

Hi Mikhail,

Thanks for the further improvements to the Atom Replacer node; this is really good. Would it be possible to make some minor additions to the node, please?

In the "Replace Specific Atoms" dropdown list in the Atom Replacer configuration window, please can you add two extra options:

1. Any R group. This will replace all atoms with an Rx label, i.e. R1, R2, R3 etc. will all be replaced. At the moment you can use Custom and type in R1 etc., but this only allows one Rx group to be replaced at a time. It would be useful to be able to replace all of them.

2. Attachment group. In the output from the "R Group Decomposition" node, the "R-Group #1" etc. columns show the decomposed group, and the attachment point is labelled with a zigzag line on the atom. I would like the ability to convert this zigzag line into a normal atom. I believe this is currently still not possible. Essentially, I want to convert the attachment point to a bromine, for instance, when searching for starting materials.

Overall, Mikhail, this node is really nice, and it makes it really user friendly for medicinal chemists to undertake simple transformations without having to use reaction notation, which is often not very user friendly. So thanks very much.

Thanks,

Simon.

Hello Simon,

The Atom Replacer node can now replace attachment points. You can download an updated version from the nightly builds.

R-group replacement was implemented in the previous version, but I'm not sure whether we solved your request for R-groups. If you have any other suggestions for this node, let me know.

Best regards,
Mikhail

Yes, you did solve my request. The Atom Replacer node is one of my favourite nodes. It's very powerful.

Thanks,

Simon.