I'm using the RDKit Molecule Substructure Filter to filter molecules using small fragments (generated with the MoSS node) as queries. Everything works as desired 95% of the time, but in certain cases the query is identified inside a fused ring system (e.g. pyridine found in a quinoline), which I would like to avoid. Is there any way to do this?
Reading other topics, I get the feeling that I should make the queries more explicit but I'm not sure how.
Thanks & best regards,
Adam
Edit: Ideally I would like to prevent matches inside fused aromatic rings but still find large aliphatic macrocycles containing the query. I made an attempt to do this with a script but couldn't discern between the ring types.
Anyway, an example of what I would like to achieve is NOT finding C1(C2=CC=CC=C2)=NC=CC=N1 within C12=C(C=CC=C2)C=NC(C3=CC=CC=C3)=N1.
If possible, I would also like to control whether or not the first structure is found in something like C1(C2=CC=CC=C2)=NC=C(CC3)C(CCOCCN4C=CC5=CC=CC3=C54)=N1, but this is less important than the first issue.
This is not currently possible directly within the RDKit nodes, but one of the ideas for a new node that we've already captured (https://github.com/rdkit/knime-rdkit/issues/4) would allow this kind of modification of query molecules.
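Outside of KNIME, this kind of query modification can be approximated with the RDKit Python API. Here is a minimal sketch, assuming the fragments are available as SMILES and that your RDKit version provides `Chem.AdjustQueryProperties`. It pins the heavy-atom degree of every ring atom in the query, so a ring atom with two neighbours in the query can no longer match a ring-fusion atom in the target. The variable names are only for illustration.

```python
from rdkit import Chem

# The example structures from the question.
query = Chem.MolFromSmiles("C1(C2=CC=CC=C2)=NC=CC=N1")
fused = Chem.MolFromSmiles("C12=C(C=CC=C2)C=NC(C3=CC=CC=C3)=N1")
macrocycle = Chem.MolFromSmiles(
    "C1(C2=CC=CC=C2)=NC=C(CC3)C(CCOCCN4C=CC5=CC=CC3=C54)=N1"
)

# Ask RDKit to add "exact degree" constraints to the query atoms.
# ADJUST_IGNORECHAINS leaves chain atoms alone, so only ring atoms are pinned.
params = Chem.AdjustQueryParameters()
params.adjustDegree = True
params.adjustDegreeFlags = Chem.ADJUST_IGNORECHAINS
strict_query = Chem.AdjustQueryProperties(query, params)

# Compare the plain query with the degree-constrained one.
print("plain query in fused system: ", fused.HasSubstructMatch(query))
print("strict query in fused system:", fused.HasSubstructMatch(strict_query))
print("strict query in macrocycle:  ", macrocycle.HasSubstructMatch(strict_query))
```

Be aware that this is a blunt tool: because the degree of every ring atom is fixed, it also rejects the macrocycle example and any target that carries an extra substituent on one of the query's ring atoms.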
That new node is on the list of candidates for the hackathon at the upcoming RDKit UGM, so it may end up showing up reasonably quickly.
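In the meantime, a script-based post-filter is one way to tell the ring types apart. The sketch below is an assumption about how this could be done with the RDKit Python API, not something the nodes do: it counts, for each atom, how many fully aromatic rings contain it, and rejects a match if an aromatic ring atom of the query lands on a target atom that sits in more aromatic rings than it does in the query. Non-aromatic rings such as an aliphatic macrocycle are not counted, so they do not block a match. The function names are hypothetical.

```python
from rdkit import Chem


def aromatic_ring_counts(mol):
    """For each atom index, count the fully aromatic rings that contain it."""
    counts = [0] * mol.GetNumAtoms()
    for ring in mol.GetRingInfo().AtomRings():
        if all(mol.GetAtomWithIdx(i).GetIsAromatic() for i in ring):
            for i in ring:
                counts[i] += 1
    return counts


def matches_without_aromatic_fusion(query, target):
    """True if at least one match does not sit inside a larger fused aromatic system."""
    q_counts = aromatic_ring_counts(query)
    t_counts = aromatic_ring_counts(target)
    for match in target.GetSubstructMatches(query):
        # match[q_idx] is the target atom index matched by query atom q_idx.
        # Only query atoms that are in an aromatic ring are checked.
        if all(
            q_counts[q_idx] == 0 or t_counts[t_idx] <= q_counts[q_idx]
            for q_idx, t_idx in enumerate(match)
        ):
            return True
    return False


# The example structures from the question.
query = Chem.MolFromSmiles("C1(C2=CC=CC=C2)=NC=CC=N1")
fused = Chem.MolFromSmiles("C12=C(C=CC=C2)C=NC(C3=CC=CC=C3)=N1")
macrocycle = Chem.MolFromSmiles(
    "C1(C2=CC=CC=C2)=NC=C(CC3)C(CCOCCN4C=CC5=CC=CC3=C54)=N1"
)

print("fused system kept:", matches_without_aromatic_fusion(query, fused))
print("macrocycle kept:  ", matches_without_aromatic_fusion(query, macrocycle))
```

Since the check runs after the substructure search, it is easy to loosen or tighten, for example by also counting small non-aromatic rings if those should block a match too.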