Chemical clustering with similarity to chemical structure

Hi all,
I am new to KNIME and I am trying to select pesticides based on chemical structure. I have a data set of fewer than 1,000 compounds and would like to cluster them so I can select 2-3 from each cluster for some experimental work. I have been following some workflows with DBSCAN, k-means and hierarchical clustering, but my results don't look great. I am also not sure how to visualise the clusters in a sort of chemical space map. Any ideas?

Hey @reinyah94,

Welcome to the community!

How are you pre-processing the compounds? Do you clean them up or normalize them before running your clustering technique on them?

I would suggest trying the ‘RDKit Structure Normalizer’ and possibly the ‘Salt Stripper’ to see if that changes your results.

Or, if you would like, there are some good suggestions in this thread:

For a high-level visualization, you can try t-SNE, which projects high-dimensional data into a 2D map. It is only a rough overview, but it can give you an idea of how many clusters to expect. That is especially useful with k-means, since you have to specify the number of clusters up front.
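On picking 2-3 compounds per cluster once you have cluster labels: here is a minimal sketch in plain Python, not a KNIME node and not necessarily how the linked workflows do it. It assumes each compound's fingerprint is available as a set of on-bit indices and that cluster labels come from whichever method you used. The compound names, fingerprints, and the `pick_representatives` helper are all hypothetical. The first pick per cluster is the medoid (the most "central" member by Tanimoto similarity); further picks are the members least similar to those already chosen, so the selection spreads across the cluster.

```python
from collections import defaultdict


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pick_representatives(fingerprints, labels, per_cluster=3):
    """Return {cluster label: [compound names]} with up to `per_cluster` picks each."""
    clusters = defaultdict(list)
    for name, label in labels.items():
        clusters[label].append(name)

    picks = {}
    for label, members in clusters.items():
        members = sorted(members)  # fixed order so ties break reproducibly

        def mean_similarity(m):
            others = [o for o in members if o != m]
            if not others:
                return 1.0
            total = sum(tanimoto(fingerprints[m], fingerprints[o]) for o in others)
            return total / len(others)

        # First pick: the medoid, i.e. highest mean similarity to the other members.
        chosen = [max(members, key=mean_similarity)]

        # Further picks: the member whose closest already-chosen compound is least similar.
        while len(chosen) < min(per_cluster, len(members)):
            remaining = [m for m in members if m not in chosen]
            nxt = min(
                remaining,
                key=lambda m: max(tanimoto(fingerprints[m], fingerprints[c]) for c in chosen),
            )
            chosen.append(nxt)
        picks[label] = chosen
    return picks


# Hypothetical toy data: five compounds in two clusters.
fps = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},
    "C": {1, 2, 3, 4, 5},
    "D": {10, 11, 12},
    "E": {10, 11, 13},
}
labels = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1}

picks = pick_representatives(fps, labels, per_cluster=2)
print(picks)
```

With real data you would replace `fps` and `labels` with the fingerprints and cluster assignments exported from your workflow. Starting from the medoid gives you one "typical" compound per cluster, and the remaining picks cover the cluster's diversity rather than being near-duplicates of it.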

This workflow may also help:

Hope this helps,
TL