Hi All,
Is there a way in KNIME to do diversity selection using Euclidean distance, or using any distance matrix? Is there a particular node for this, or some longer route to do the diversity selection?
Thank you
Kuldeep
Kuldeep,
Read the SDF, calculate properties or fingerprints, convert the fingerprints to binary if necessary, use the cluster create and apply nodes, then a sampling node, and you are done! Just develop a workflow; it will give you much more control over the logic. You can even use dendrograms for visualization.
Otherwise, use the RDKit Diversity Picker node after converting Mol -> RDKit Mol.
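For anyone following the fingerprint route: any clustering or picking on binary fingerprints needs a distance between two bit vectors. A minimal sketch follows, assuming Tanimoto distance (the thread does not say which measure the nodes use) and assuming each fingerprint is given as a Python set of on-bit indices. The function name and the example fingerprints are made up for illustration.

```python
def tanimoto_distance(bits_a, bits_b):
    """Return 1 - (shared on-bits / total distinct on-bits).

    Fingerprints are assumed to be sets of on-bit indices
    (an illustrative representation, not a KNIME or RDKit type).
    """
    union = len(bits_a | bits_b)
    if union == 0:
        # Two empty fingerprints are treated as identical.
        return 0.0
    return 1.0 - len(bits_a & bits_b) / union


# Hypothetical fingerprints: 2 shared bits out of 6 distinct bits.
fp_a = {1, 2, 3, 4}
fp_b = {3, 4, 5, 6}
print(round(tanimoto_distance(fp_a, fp_b), 3))
```

A value of 0 means identical fingerprints and 1 means no bits in common, so the result can be used wherever a distance is expected.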
Thanks InsilicoConsulting.
I have already built the workflow, but I don't want to use fingerprints on my structures; I am focusing on their shape/geometry instead. I have a few properties that describe that. Here are the steps of the workflow:
1) Read an SDF file.
2) Generate the best conformers (minimum energy) and calculate some geometry descriptors.
3) Do PCA on the descriptors and try to capture 80% of the variance.
4) Use these PCA components for k-means clustering, which generates the clusters.
5) Finally, use the sampling node to get 1-2 samples from each cluster.
This is OK, but I want to find the diverse shapes in them rather than a sample. Does KNIME have a MaxMin dissimilarity node? Thanks.
Kuldeep
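For reference, the MaxMin idea asked about above can be sketched in a few lines of Python. This is a minimal greedy version under stated assumptions: Euclidean distance on the PCA scores, an arbitrary first pick, and made-up example scores. It is not the implementation of any KNIME node, and the function name is hypothetical.

```python
import math


def maxmin_pick(points, n_pick, first=0):
    """Greedy MaxMin: repeatedly add the point whose nearest
    already-picked neighbour is farthest away (Euclidean distance)."""
    if not points:
        return []
    n_pick = min(n_pick, len(points))
    picked = [first]
    # min_dist[i] = distance from point i to its nearest picked point
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(picked) < n_pick:
        candidates = [i for i in range(len(points)) if i not in picked]
        nxt = max(candidates, key=lambda i: min_dist[i])
        picked.append(nxt)
        # Update nearest-picked distances with the newly picked point.
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return picked


# Hypothetical PCA scores: rows are molecules, columns are components.
scores = [
    (0.0, 0.0),
    (0.1, 0.0),
    (5.0, 0.0),
    (0.0, 4.0),
    (5.0, 4.1),
]
print(maxmin_pick(scores, 3))  # -> [0, 4, 2]
```

Unlike sampling 1-2 rows per cluster, this picks rows directly by how far they are from everything already chosen, so near-duplicates such as rows 0 and 1 are not both selected early. The result depends on the starting row.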
If you already have the shape-related descriptors as double values, then why not use a loop and calculate the Euclidean distance or something else? Use the math expression node inside the loop.
Also see if the distance matrix node helps.
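The loop-and-formula suggestion amounts to filling a pairwise distance matrix. Here is a rough sketch of that calculation outside KNIME, assuming each molecule is a tuple of numeric descriptor values. The function names and the example descriptors are invented for illustration.

```python
import math


def euclidean_distance_matrix(rows):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(rows)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(rows[i], rows[j])))
            matrix[i][j] = matrix[j][i] = d
    return matrix


def most_dissimilar_pair(matrix):
    """Return the (i, j) index pair with the largest distance, or None."""
    best_pair, best_dist = None, -1.0
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            if matrix[i][j] > best_dist:
                best_pair, best_dist = (i, j), matrix[i][j]
    return best_pair


# Hypothetical shape descriptors, one tuple per molecule.
descriptors = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 1.0)]
dm = euclidean_distance_matrix(descriptors)
print(dm[0][1])                  # -> 5.0
print(most_dissimilar_pair(dm))  # -> (1, 2)
```

Once the matrix exists, any selection that only needs pairwise distances can work from it, whichever distance measure was used to fill it.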
Hi Kuldeep,
I am very interested in your workflow. Could you give more information on the "geometry descriptors" that you use?
Thanks,
Lionel
Lionel, check out http://users.abo.fi/mivainio/shaep/index.php