Identifying fragments of a multi-fragment structure

Given a set of structures such as
O.O.O.O.O.OC@@Hc3ccnc4ccccc34.OS(=O)(=O)O

I would like to be able to say ‘this molecule has 5 molecules of water.’ (There is a list of known salt counterions and solvents to use for matching fragments.)
I’ve gotten part of the way using the RDKit Molecule Extractor, which creates a new record for each fragment.
The next step is to put a name (or identifier) on each fragment.
The RDKit Molecule Substructure almost works.
It tells me that the extracted water molecules match ‘[OH2]’, but [OH2] also matches the parent fragment, which is not what my team is looking for.

How can I perform an exact match of a set of generated fragments against a set of known molecules?

Regards,
Mitch

Hi @ChemMitch

I have to admit I am not sure I understood your question correctly, or, if I did, your SMILES may not be the best example. But maybe that workflow will help you.
Best,
Alice
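
For readers working outside KNIME, here is a minimal sketch of one way to do exact fragment matching with the RDKit Python API: split the structure into fragments, canonicalize each one, and look the canonical SMILES up in a table of known counterions and solvents. It is not the workflow from this thread. The table `KNOWN_FRAGMENTS`, the function names, and the example input (which uses benzyl alcohol as a stand-in parent, not the structure from the original post) are assumptions made for illustration.

```python
from collections import Counter

from rdkit import Chem

# Hypothetical lookup table: name -> SMILES of a known counterion or solvent.
KNOWN_FRAGMENTS = {
    "water": "O",
    "sulfuric acid": "OS(=O)(=O)O",
    "hydrogen chloride": "Cl",
}


def canonical(smiles):
    """Return the RDKit canonical SMILES for a SMILES string."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return Chem.MolToSmiles(mol)


# Index the known fragments by canonical SMILES so that different
# spellings of the same molecule map to the same key.
KNOWN_BY_CANONICAL = {canonical(s): name for name, s in KNOWN_FRAGMENTS.items()}


def name_fragments(smiles):
    """Count fragments, using a known name where the whole fragment matches."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    counts = Counter()
    for frag in Chem.GetMolFrags(mol, asMols=True):
        key = Chem.MolToSmiles(frag)
        # Unknown fragments (e.g. the parent) are reported by their SMILES.
        counts[KNOWN_BY_CANONICAL.get(key, key)] += 1
    return counts


if __name__ == "__main__":
    # Stand-in example: five waters, a parent with an OH group, one acid.
    for label, n in name_fragments("O.O.O.O.O.OCc1ccccc1.OS(=O)(=O)O").items():
        print(n, label)
```

Because whole fragments are compared rather than substructures, the hydroxyl group on the parent is not counted as water.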


Thank you, Alice!
That approach works well for me.

Regards,
Mitch


Hello,

A follow-up question: it looks like the Molecular Substructure Filter is ignoring double bond stereochemistry. Fumarate and maleate match each other.
Is there a way to distinguish between them?

Mitch

Hi Mitch,
Though the RDKit supports using stereochemistry, that functionality is not yet exposed in the KNIME nodes.
This is something we can/should change… I will put it on the list.
-greg
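
Until that is available in the nodes, one possible workaround outside KNIME is to compare canonical SMILES generated with stereochemistry included, using the RDKit Python API directly. The sketch below is an assumption about how one might do this, not a description of the KNIME nodes; the helper names `canonical_smiles` and `same_structure` are hypothetical.

```python
from rdkit import Chem

# Fumaric acid is the E isomer, maleic acid the Z isomer of the same skeleton.
FUMARIC_ACID = "OC(=O)/C=C/C(=O)O"
MALEIC_ACID = r"OC(=O)/C=C\C(=O)O"


def canonical_smiles(smiles, with_stereo=True):
    """Canonical SMILES, optionally dropping stereochemistry information."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return Chem.MolToSmiles(mol, isomericSmiles=with_stereo)


def same_structure(a, b, with_stereo=True):
    """Exact-match comparison of two structures via canonical SMILES."""
    return canonical_smiles(a, with_stereo) == canonical_smiles(b, with_stereo)


if __name__ == "__main__":
    print("stereo-aware match:", same_structure(FUMARIC_ACID, MALEIC_ACID))
    print("stereo-blind match:",
          same_structure(FUMARIC_ACID, MALEIC_ACID, with_stereo=False))
```

With stereochemistry included the two acids compare as different; with it dropped they collapse to the same string, which mirrors the behaviour described in the question.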