Scaffold Finder


I like using the "Scaffold Finder" node to find the maximum common scaffold, but would it be possible to have an option to find commonly occurring scaffolds rather than just the Maximum Common Scaffold?

So basically, scaffolds which come up a lot in the molecules but are not present in every molecule. This would be useful for identifying subseries within a dataset and for clustering molecules together.
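To make the request more concrete, here is a minimal Python sketch of the kind of frequency filter I have in mind. It is not how the Indigo algorithm works; it assumes the candidate scaffolds for each molecule have already been computed by some other step and are given as plain labels. The function name `frequent_scaffolds`, the `min_support` threshold and the example labels are all hypothetical.

```python
from collections import Counter


def frequent_scaffolds(scaffolds_per_molecule, min_support=0.3):
    """Return {scaffold: fraction of molecules containing it} for every
    scaffold found in at least `min_support` of the molecules.

    `scaffolds_per_molecule` is a list with one collection of scaffold
    labels per molecule (an assumed, precomputed input).
    """
    n = len(scaffolds_per_molecule)
    if n == 0:
        return {}
    counts = Counter()
    for scaffolds in scaffolds_per_molecule:
        # Count each scaffold at most once per molecule.
        counts.update(set(scaffolds))
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


# Hypothetical dataset: four molecules, labelled by the scaffolds they contain.
dataset = [
    {"indole", "benzene"},
    {"indole", "benzene"},
    {"indazole", "benzene"},
    {"pyridine"},
]
result = frequent_scaffolds(dataset, min_support=0.5)
```

With a threshold of 0.5, only the scaffolds found in at least half of the molecules are kept, so a subseries shows up even though no single scaffold is shared by the whole dataset.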




We have been thinking about this problem for a while, and we realized that it is doable by making some adjustments to our present algorithm. We will schedule it for this summer, and as soon as it is done in the core Indigo library, we will add the corresponding node in KNIME.


Best regards,


Thanks for looking into it.

I was also using the Scaffold Finder on a dataset of indole and indazole cores, and the resulting MCS is only a very small fragment of carbon atoms. However, as a medicinal chemist, I would consider the indole and indazole to share the same scaffold, in which the 2-position can be either N or C (i.e. an A atom).

So the question is really: can the Scaffold Finder node simply analyse the framework of heavy atoms for the actual scaffold, rather than treating a change of heavy atom as a different scaffold? I think in this way you can quickly get an idea, from a dataset of molecules, of what the minimal required scaffold (or minimal required shape) is in a series to achieve activity.
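As a rough illustration of the idea (not of anything the node currently does), the Python sketch below builds a consensus scaffold for indole and indazole. It relies on a big simplifying assumption: both ring systems are already aligned by their conventional ring numbering, so no graph matching is needed. A real implementation would have to find that atom mapping itself. The names `RING_POSITIONS` and `consensus_scaffold` are hypothetical.

```python
# Conventional numbering of the fused 5,6-bicyclic ring system.
RING_POSITIONS = ["1", "2", "3", "3a", "4", "5", "6", "7", "7a"]

# Heavy-atom element at each ring position (hydrogens ignored).
indole = dict(zip(RING_POSITIONS, ["N", "C", "C", "C", "C", "C", "C", "C", "C"]))
indazole = dict(zip(RING_POSITIONS, ["N", "N", "C", "C", "C", "C", "C", "C", "C"]))


def consensus_scaffold(molecules):
    """Keep the element where all molecules agree; otherwise use the
    wildcard 'A' (any heavy atom). Assumes pre-aligned positions."""
    consensus = {}
    for pos in RING_POSITIONS:
        elements = {mol[pos] for mol in molecules}
        consensus[pos] = next(iter(elements)) if len(elements) == 1 else "A"
    return consensus


consensus = consensus_scaffold([indole, indazole])
variable_positions = [p for p in RING_POSITIONS if consensus[p] == "A"]
```

Here only the 2-position becomes an A atom, while the rest of the bicyclic framework is kept, which is the scaffold I would expect to see for this series.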




Was there any progress in the work you were undertaking on different algorithms for identifying common scaffolds?

It's always useful to have different approaches to identifying scaffolds, so if you have something, it would be lovely to have it implemented in KNIME.