FingerPrint Similarity

richards99 · May 5, 2011, 8:01am

Hi,

Thanks for all these extra chemistry nodes!

Please can you add an extra feature to the "FingerPrint Similarity" node? It would be good to allow the second port to accept more than one template fingerprint, with an option in the dialog box to compare each first-port fingerprint with all the second-port fingerprints and return either the average similarity, the maximum similarity to one of the fingerprints, or the minimum similarity to one of the fingerprints.
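
To make the requested aggregation concrete, here is a minimal sketch in plain Python. It is not the Indigo node's implementation: fingerprints are modelled as sets of on-bit indices, Tanimoto is assumed as the similarity metric, and the names `tanimoto`, `AGGREGATORS` and `aggregated_similarity` are hypothetical, chosen only for illustration.

```python
from statistics import mean


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


# Hypothetical mapping from an "aggregation type" label to a Python function.
AGGREGATORS = {"average": mean, "maximum": max, "minimum": min}


def aggregated_similarity(query, templates, aggregation="average"):
    """Compare one query fingerprint with every template and aggregate the scores."""
    scores = [tanimoto(query, template) for template in templates]
    return AGGREGATORS[aggregation](scores)


# Toy data (assumed, not taken from any real molecule).
query = {1, 2, 3, 4}
templates = [{1, 2, 3, 4}, {1, 2}, {5, 6}]

for kind in ("average", "maximum", "minimum"):
    print(kind, aggregated_similarity(query, templates, kind))
```

With a single template the three aggregations necessarily return the same value, since there is only one score to aggregate.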

Thanks

Simon.

dpavlov · May 5, 2011, 7:02pm

Simon,

Thank you for the suggestion, it sounds completely reasonable. We have added this extra feature, and it will be available tomorrow with the next nightly build of Indigo nodes.

Regards,

Dmitry Pavlov
GGA Software Services LLC

richards99 · May 6, 2011, 7:21am

Absolutely brilliant, these new features are really useful. I have just been trying them out and they work great.

Thanks,

Simon.

BJFR · May 10, 2011, 7:14am

Hi Simon, Hi Dmitry,
What are the differences between this Indigo FP node and the Tanimoto FP distance node from Tripos?

Regards,
B

sisaym · May 10, 2011, 11:27am

Hi Dmitry,

I would have really preferred to have the 'Aggregation type' as an option ('Checkbox'). What if I want to get the exact similarity value (not min, max or average) between a reference molecule and several molecules in a database?

Thanks,

M.

dpavlov · May 10, 2011, 12:05pm

sisaym: if you need to get the exact similarity value between a reference molecule and several molecules in a database, you just pass the single reference fingerprint into the second input port. The aggregation type does not matter in that case: each of the "database" molecules will have its own (single) similarity value.

Best regards,

Dmitry

dpavlov · May 10, 2011, 12:42pm

BJFR: I cannot say exactly what the difference is (if any), because I do not have the Tripos nodes. I have just tried to download and install them, but they do not work and print these error messages:

ERROR  RepositoryManager  Node com.tripos.knime.chem.node.moleculevalidator' from plugin 'com.tripos.knime.plugin.core' could not be created.
ERROR  RepositoryManager  Node com.tripos.knime.chem.node.moleculeparser' from plugin 'com.tripos.knime.plugin.core' could not be created.
ERROR  SMILES Reader  Execute failed: Could not initialize class com.tripos.common.ct.Translator

Anyway: while the functionality of our Fingerprint Similarity node may be the same as that of the Tripos one, the Indigo fingerprints are different from Tripos'.

Best regards,

Dmitry

sisaym · May 10, 2011, 1:26pm

Thank you. How about if I have more than one reference molecule? It would be great if a column for each reference molecule were passed to the output table (as a default or an option). The 'Aggregation type' is simpler to use, but it would also be nice to have the individual similarity values for later use. By the way, could you also add K-NN or Centroid options to the 'Aggregation type'?

Thanks again for the help!!

M.

BJFR · May 10, 2011, 8:35pm

Dmitry,
I got these errors as well. It looks like the node is quite sensitive to formats: even molecules coming from the Symyx SDF reader nodes do not get through!
The workaround I found is as follows:

Convert from the format you have to SMILES.
Feed the SMILES into the Unity FP node and then into the Tanimoto distance node.
Note that the CDK FP and the RDK FP are accepted by the Tripos Tanimoto distance node.

Regards,
Bruno

richards99 · May 11, 2011, 7:19am

Hi M,

You can get similarity values per individual fingerprint by using the Distance Matrix Calculator in KNIME under Distance Matrix. It will put all the values in one column, separated by commas, but you can split this into individual columns using the Cell Splitter node.
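
The idea of "one comma-separated cell, then split into columns" can be sketched outside KNIME in plain Python. This is only an illustration of the data shape, not how the Distance Matrix Calculator or Cell Splitter nodes are implemented: fingerprints are modelled as sets of on-bit indices, the Tanimoto distance is assumed, and every name (`tanimoto_distance`, `split_cell`, `ref_A`, `mol_1`, and so on) is hypothetical.

```python
def tanimoto_distance(a, b):
    """Tanimoto distance (1 - similarity) of two sets of on-bit indices."""
    union = len(a | b)
    similarity = len(a & b) / union if union else 0.0
    return 1.0 - similarity


def split_cell(cell, delimiter=","):
    """Split one delimited cell into a list of floats, one per future column."""
    return [float(value) for value in cell.split(delimiter)]


# Toy reference and database fingerprints (assumed data).
references = {"ref_A": {1, 2, 3}, "ref_B": {3, 4}}
molecules = {"mol_1": {1, 2, 3}, "mol_2": {3, 4, 5}}

# Step 1: one comma-separated cell per molecule, holding the distance to each reference.
combined = {
    name: ",".join(f"{tanimoto_distance(fp, ref):.3f}" for ref in references.values())
    for name, fp in molecules.items()
}

# Step 2: split each cell so that every reference gets its own column.
table = {
    name: dict(zip(references, split_cell(cell)))
    for name, cell in combined.items()
}

for name, row in table.items():
    print(name, row)
```

Keeping one value per reference in its own column preserves the individual distances for later use, which an aggregated value alone does not.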

Hope this helps

Simon.

dpavlov · May 27, 2011, 1:38pm

sisaym: Does the "Distance Matrix Calculator" node that Simon suggested work for you here?

As for the K-NN and Centroid options: it seems that you are suggesting performing K-means or K-NN clustering of the input set of molecules, with the reference molecules forming the clusters, right? If yes, then I would say that this task is a bit out of Indigo's scope, which is organic chemistry. KNIME has its own nodes for clustering; are they not sufficient for that? If we can do anything to pass the data to KNIME's own nodes in a more convenient way, then of course we will be happy to do that. What are your thoughts?
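
One possible reading of "reference molecules forming the clusters" is a nearest-reference assignment, where each input molecule is put into the group of the reference it is most similar to. The sketch below illustrates only that reading, in plain Python. It is not a feature of the Indigo nodes or of KNIME's clustering nodes: fingerprints are modelled as sets of on-bit indices, Tanimoto similarity is assumed, and the function and variable names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def assign_to_nearest_reference(molecules, references):
    """Map each molecule name to the name of its most similar reference."""
    assignment = {}
    for name, fingerprint in molecules.items():
        # max() keeps the first reference encountered when similarities tie.
        assignment[name] = max(
            references,
            key=lambda ref_name: tanimoto(fingerprint, references[ref_name]),
        )
    return assignment


# Toy data (assumed, not taken from any real molecule).
references = {"ref_A": {1, 2, 3}, "ref_B": {7, 8, 9}}
molecules = {"mol_1": {1, 2, 3, 9}, "mol_2": {7, 8}}

print(assign_to_nearest_reference(molecules, references))
```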

Best regards,

Dmitry

Dr_Van_Nostrand · October 25, 2012, 12:56pm

Hi,

How can I convert the Distance Matrix column to something the Cell Splitter node can handle?
