How to cluster molecules based on their structure?



I've been using KNIME to screen a database of molecules against a certain target compound (substructure search, FTrees, fingerprints...). In the end I obtained a series of SD files. I'd like to merge these files and cluster the molecules based on their core scaffold. Is it possible to classify these structures (approximately 1000 compounds) according to their (central) structure?

In the near future, I'm planning to go through these 1000 compounds (from the virtual screen) manually, so it would help if the structures were arranged in classes.


Kind regards,


You could use the "MoSS MCSS Molecule Similarity" node to compute a distance matrix based on the size of the pairwise maximum common substructure (MCSS). Then run e.g. k-Medoids or Hierarchical Clustering on that distance matrix.
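
If it helps to see what the second step does, here is a minimal Python sketch of k-medoids on a precomputed distance matrix. It is not what the KNIME node does internally; the function name `k_medoids`, the deterministic farthest-first initialisation and the 5x5 matrix are all assumptions made for illustration. In practice the matrix would come from the MCSS similarity step, for example as 1 minus a normalised MCSS size (again an assumption about the normalisation).

```python
def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed, symmetric distance matrix."""
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of items")

    def assign(medoids):
        # Label each item with the index of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]

    # Deterministic start (an illustrative choice): the most central item,
    # then repeatedly the item farthest from all medoids chosen so far.
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        rest = [i for i in range(n) if i not in medoids]
        medoids.append(max(rest, key=lambda i: min(dist[i][m] for m in medoids)))

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance
            # to the other members of its cluster.
            new_medoids.append(
                min(members, key=lambda i: sum(dist[i][j] for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


# Made-up distances for five molecules: 0-2 share one scaffold, 3-4 another.
dist = [
    [0.00, 0.20, 0.30, 0.80, 0.90],
    [0.20, 0.00, 0.25, 0.85, 0.80],
    [0.30, 0.25, 0.00, 0.90, 0.85],
    [0.80, 0.85, 0.90, 0.00, 0.10],
    [0.90, 0.80, 0.85, 0.10, 0.00],
]

medoids, labels = k_medoids(dist, k=2)
print(medoids)  # [1, 3]
print(labels)   # [0, 0, 0, 1, 1]
```

Each medoid is an actual molecule from the set, so it can serve as the representative structure of its class when you browse the results by hand. With k-medoids you have to choose the number of classes up front; hierarchical clustering lets you cut the tree at a distance threshold instead.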

There is a community workflow containing many examples of this type of medchem search:


(i.e. under examples, folder 050 Applications). I found it quite helpful and was able to build some workflows of my own based on a few of those examples; maybe it can help you too.

Good morning,

I am interested in this workflow as it pertains to one of our workflows and may be able to help us. However, I am missing several nodes and receive this error message. When KNIME searches for the missing nodes, it cannot find them. Please advise…