Thank you both for your answers. I am now a little closer to this goal.
I opened my .sdf file in ChemDraw, added R-groups where necessary, then selected all of the compounds and copied them as SMILES.
I then pasted them into an Excel file, where each compound is separated by a “.”. I imported this into KNIME, transposed it, and added a column for the carboxylic acid SMARTS string (I used the following: [R][C](=O)[OH]). Then I replaced all “R” with “*”.
This now works in the One Component Reaction node and generates the required transformations.
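For anyone who wants to script the manual steps above, they roughly amount to the following Python sketch. The example SMILES and the function names are made up for illustration, and it assumes the R-group shows up in the copied SMILES as a bracketed “[R]”. It only does string handling, with no chemistry checks, and it does no atom mapping.

```python
import re

# Hypothetical example input: dot-separated SMILES as pasted from ChemDraw.
chemdraw_smiles = "[R]C(=O)NC.[R]C(=O)OC"

# Carboxylic acid pattern from the post, still carrying the R placeholder.
ACID_SMARTS = "[R][C](=O)[OH]"


def r_to_wildcard(pattern: str) -> str:
    """Replace the R placeholder with '*', leaving symbols like Ru or Rh alone."""
    return re.sub(r"R(?![a-z])", "*", pattern)


def build_reactions(smiles_block: str, reactant: str = ACID_SMARTS) -> list[str]:
    """Build one 'reactant>>product' string per dot-separated product."""
    # Naive split: assumes no single compound contains '.' itself (e.g. salts).
    products = [s for s in smiles_block.strip().split(".") if s]
    lhs = r_to_wildcard(reactant)
    return [f"{lhs}>>{r_to_wildcard(p)}" for p in products]


reactions = build_reactions(chemdraw_smiles)
for rxn in reactions:
    print(rxn)
```

The resulting strings are what I feed to the One Component Reaction node as the reaction column.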
This is still a bit more work than I was initially planning and is essentially semi-automated, although generating the lists of SMARTS reactions should hopefully be a one-time job.
Issues I still have:
- In KNIME: going directly from an SDF file with R-groups or attachment points to a working SMILES string (that I could then “convert” into a working SMARTS string) is tricky. OpenBabel reads the R-group as “*”, which is a pain to deal with later. The Molecule Type Cast node gives the SDF string rather than the SMILES string (this still confuses me a lot…).
- I expect this method will cause problems when replacing more than 2 R-groups, since I am not mapping the atoms as suggested above. I would like to avoid doing this manually again…
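On the atom mapping point, one possible way to avoid doing it by hand is sketched below in Python. This is untested against real ChemDraw output: it assumes the R-groups appear in the strings as numbered labels such as “[R1]” and “[R2]”, and that the same number means the same group on both sides of the reaction. The function name and the example strings are hypothetical.

```python
import re


def map_r_groups(pattern: str) -> str:
    """Turn numbered placeholders [R1], [R2], ... into mapped wildcards [*:1], [*:2], ..."""
    return re.sub(r"\[R(\d+)\]", r"[*:\1]", pattern)


# Hypothetical reactant and product with two R-groups each.
reactant = "[R1]C([R2])C(=O)[OH]"
product = "[R1]C([R2])C(=O)NC"

# Same label number on both sides gives the same atom map number.
mapped_reaction = f"{map_r_groups(reactant)}>>{map_r_groups(product)}"
print(mapped_reaction)
```

Whether the SMILES copied from ChemDraw actually keeps the R-group numbers is something I still need to check.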
I am still looking for a way to do this automagically from an SDF so I’ll leave this post open for a little longer.
Cheers,
Tony