Clustering of molecules

sparel · March 25, 2014, 2:46pm

Hi,

I would like to cluster a set of molecules (100s - 1000s, or even more) based on chemical fingerprints (something like hierarchical or k-means clustering for instance), but I'm not sure which nodes would be the most appropriate. It doesn't need to be exact.

The output should be a cluster ID for each molecule, with an indication if a particular molecule is the cluster center or not.

Any idea how to do that?

Thank you in advance

richards99 · March 25, 2014, 6:25pm

Try the k-Medoids node. This should work pretty well.

Use the RDKit Fingerprint node to generate the fingerprints (Morgan, for instance), then use the Distance Matrix Calculate node to generate a distance matrix. Connect this to the k-Medoids node and specify how many clusters you would like. The cluster centre (medoid) is also reported.
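For anyone who wants to see the same idea outside KNIME, here is a minimal Python sketch of the three steps (fingerprints, distance matrix, k-medoids). It is only an illustration under assumptions and is not what the KNIME nodes do internally: the fingerprints are hypothetical toy sets of "on" bit indices rather than real Morgan fingerprints, the distance is Tanimoto distance, and `k_medoids` is a simple alternating assign/update scheme with helper names made up for this example.

```python
import random

def tanimoto_distance(a, b):
    """Tanimoto distance between two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union

def distance_matrix(fps):
    """Full pairwise distance matrix as a list of lists."""
    n = len(fps)
    return [[tanimoto_distance(fps[i], fps[j]) for j in range(n)] for i in range(n)]

def assign(dist, medoids):
    """Label each item with the index of its nearest medoid (ties go to the lower index)."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]

def k_medoids(dist, k, max_iter=100, seed=0):
    """Simple alternating k-medoids: assign items, then re-pick each cluster's medoid."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(dist)), k)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to the others.
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(dist, medoids), medoids

# Hypothetical toy fingerprints: two obvious groups of three "molecules" each.
fps = [
    {1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 4, 5},
    {10, 11, 12}, {10, 11, 13}, {10, 11, 12, 13},
]

dist = distance_matrix(fps)
labels, medoids = k_medoids(dist, k=2)

# One row per molecule: (molecule index, cluster ID, is this molecule the medoid?)
rows = [(i, labels[i], i in medoids) for i in range(len(fps))]
for row in rows:
    print(row)
```

Each row gives the cluster ID plus a flag for whether that molecule is the cluster centre, which is the output asked for above. The full distance matrix grows quadratically with the number of molecules, so this sketch is only meant for small sets.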

Simon.

sparel · March 26, 2014, 8:20am

Perfect! That's exactly what I was looking for. Thank you for your help, Simon.

flyingmolecule · March 10, 2017, 3:55am

Hi,

I've got the same question here. I'm new to KNIME and want to cluster molecules. I'm reading in SMILES from a text file using the Line Reader node, and that works. But when I try to run the RDKit From Molecule node, it says 'No column in spec compatible to "SmilesValue" "SmartsValue" or "SdfValue".' How can I solve this? Thanks!

greglandrum · March 10, 2017, 8:09am

Hi,

You need to make sure that the column coming from the File Reader is marked as type SMILES.

There are two easy ways to do this:

1. Change the type directly in the File Reader node by clicking on the column header and setting the type to SMILES.
2. Use the Molecule Type Cast node to convert the column to type SMILES after the table has already been read in.

After you do this the RDKit nodes should work without problems.

Note that you don't need a Molecule to RDKit node in order to generate molecular fingerprints for clustering. The RDKit Fingerprint node can also directly process SMILES (or SDF) columns.

-greg

flyingmolecule · March 11, 2017, 2:31am

Thanks greg! It works!
