I am new to KNIME but already it feels like a really powerful tool.
What I would really like to do is import an SD file and run a clustering protocol on the structures, so that I can select a subset of diverse molecules from a larger set.
Any help gratefully received.
Sorry, I missed your post.
There are several commercial packages available - some of which are free to academics. Our own includes a clustering capability based on 2D fingerprints. There are options to control whether the output is a fixed number of clusters or a variable number with a fixed similarity between members. See http://www.treweren.com for more details.
There is also a free-to-all set of nodes called RDKit in the Community nodes. You can convert your molecules to RDKit, and then use the RDKit FingerPrinter to generate a fingerprint for each molecule. Then use the Distance Matrix or a similar node to compare fingerprints.
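To make the fingerprint-then-distance idea concrete, here is a rough sketch in plain Python of what happens after the fingerprints exist. It is not what any KNIME or RDKit node does internally. The fingerprints are made-up sets of "on" bit positions, and the function names (tanimoto, distance_matrix, pick_diverse) are my own, invented for illustration. The picking step is a simple greedy MaxMin selection, which is one common way to get a diverse subset from a distance matrix.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def distance_matrix(fps):
    """Full matrix of Tanimoto distances (1 - similarity)."""
    n = len(fps)
    return [[1.0 - tanimoto(fps[i], fps[j]) for j in range(n)] for i in range(n)]


def pick_diverse(dist, k, first=0):
    """Greedy MaxMin: repeatedly add the molecule whose nearest
    already-picked neighbour is furthest away."""
    picked = [first]
    while len(picked) < min(k, len(dist)):
        candidates = [i for i in range(len(dist)) if i not in picked]
        best = max(candidates, key=lambda i: min(dist[i][j] for j in picked))
        picked.append(best)
    return picked


# Hypothetical toy fingerprints, standing in for real 2D fingerprints.
fingerprints = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {10, 11, 12},
    {10, 11, 13},
    {20, 21},
]

dist = distance_matrix(fingerprints)
subset = pick_diverse(dist, k=3)
print(subset)  # row indices of the selected diverse molecules
```

With real data the sets would come from whatever fingerprint node or toolkit you use, and the returned indices would be used to filter the rows of the original table.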
I'm new to KNIME and have a similar problem.
I shouldn't really be making comparisons between KNIME and Pipeline Pilot, but it's difficult not to for this sort of analysis. In PP I'd cluster my compounds based on their fingerprint similarity and would be presented with a series of tables of structures. I'd be able to easily see how the compounds have been separated, without having to muck about looking at dendrograms.
In KNIME it seems that Distance Matrix nodes are good for calculating the Tanimoto similarity, but the clustering visualization is no use to me.
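For what it's worth, the kind of output I mean can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the fingerprints and molecule names are invented, leader_cluster is a hypothetical helper, and the method is a simple leader-style clustering with a fixed Tanimoto threshold, not the algorithm Pipeline Pilot or any KNIME node uses. The point is the result: a cluster number per molecule, which can be listed as one table per cluster instead of a dendrogram.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def leader_cluster(fps, threshold):
    """Put each fingerprint into the first cluster whose leader (first member)
    is at least `threshold` similar; otherwise start a new cluster."""
    clusters = []  # each cluster is a list of indices, leader first
    for i, fp in enumerate(fps):
        for members in clusters:
            if tanimoto(fp, fps[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical names and toy fingerprints for illustration only.
names = ["mol_A", "mol_B", "mol_C", "mol_D", "mol_E"]
fingerprints = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {10, 11, 12},
    {10, 11, 13},
    {20, 21},
]

clusters = leader_cluster(fingerprints, threshold=0.5)

# One small "table" per cluster.
for number, members in enumerate(clusters, start=1):
    print(f"Cluster {number}")
    for i in members:
        print(f"  {names[i]}")
```

In a workflow, the equivalent would be a cluster ID column appended to the structure table, so the table can be sorted or grouped by cluster.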
Anyone know how to be presented with the kind of output I want?