FingerPrint Similarity

Hi,

Thanks for all these extra chemistry nodes!

 

Could you please add an extra feature to the "FingerPrint Similarity" node? It would be good to allow the second port to accept more than one template fingerprint, with an option in the dialog box to compare each first-port fingerprint with all the second-port fingerprints and return either the average similarity, the maximum similarity to one of the fingerprints, or the minimum similarity to one of the fingerprints.
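To make the request concrete, here is a minimal Python sketch of the three aggregation modes. It is not the Indigo implementation: it assumes fingerprints are represented as sets of on-bit indices and that similarity is the Tanimoto coefficient, and the names `tanimoto`, `aggregated_similarity` and `AGGREGATORS` are made up for illustration.

```python
from statistics import mean


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


# Hypothetical names for the three options requested for the dialog box.
AGGREGATORS = {"average": mean, "maximum": max, "minimum": min}


def aggregated_similarity(query, templates, aggregation="average"):
    """Compare one first-port fingerprint with every second-port fingerprint."""
    scores = [tanimoto(query, template) for template in templates]
    return AGGREGATORS[aggregation](scores)


# Toy data (assumed, not real fingerprints).
query = frozenset({1, 2, 3, 4})
templates = [frozenset({1, 2, 3, 4}), frozenset({1, 2}), frozenset({5, 6})]

for mode in ("average", "maximum", "minimum"):
    print(mode, aggregated_similarity(query, templates, mode))
```

With these toy fingerprints the individual similarities are 1.0, 0.5 and 0.0, so the three modes give 0.5, 1.0 and 0.0 respectively.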

Thanks

Simon.

Simon,

Thank you for the suggestion, it sounds completely reasonable. We have added this extra feature, and it will be available tomorrow with the next nightly build of Indigo nodes.

Regards,

Dmitry Pavlov
GGA Software Services LLC

Absolutely brilliant, these new features are really useful. I have just been trying them out and they work great.

Thanks,

Simon.

Hi Simon, Hi Dmitry,
What are the differences between this Indigo FP node and the Tanimoto FP distance node from Tripos?

Regards,
B

Hi Dmitry,

I would really have preferred to have the 'Aggregation type' as an option ('Checkbox'). What if I want to get the exact similarity value (not min, max or average) between a reference molecule and several molecules in a database?

Thanks,

M.

sisaym: if you need to get the exact similarity value between a reference molecule and several molecules in a database, you just pass the single reference fingerprint into the second input port. The aggregation type does not matter in that case: each of the "database" molecules will have its own (single) similarity value.
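The point about the aggregation type can be checked with a small Python sketch. It is only an illustration under stated assumptions: fingerprints are modelled as sets of on-bit indices, similarity is taken to be the Tanimoto coefficient, and the molecule names and data are invented.

```python
from statistics import mean


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


# A single reference fingerprint on the (hypothetical) second port.
reference = [frozenset({1, 2, 3})]

# Toy "database" fingerprints on the first port.
database = {
    "mol_a": frozenset({1, 2, 3}),
    "mol_b": frozenset({1, 2}),
    "mol_c": frozenset({7}),
}

rows = {}
for name, fp in database.items():
    scores = [tanimoto(fp, ref) for ref in reference]
    # With one reference there is one score, so min, max and mean coincide.
    rows[name] = (min(scores), max(scores), mean(scores))

for name, (lo, hi, avg) in rows.items():
    print(name, lo, hi, avg)
```

Each row carries the same value three times, which is why the choice of aggregation has no effect when only one reference fingerprint is supplied.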

 

Best regards,

Dmitry

BJFR: I cannot say exactly what the difference is (if any), because I do not have the Tripos nodes. I have just tried to download and install them, but they do not work and print these error messages:

ERROR  RepositoryManager  Node com.tripos.knime.chem.node.moleculevalidator' from plugin 'com.tripos.knime.plugin.core' could not be created.
ERROR  RepositoryManager  Node com.tripos.knime.chem.node.moleculeparser' from plugin 'com.tripos.knime.plugin.core' could not be created.
ERROR  SMILES Reader  Execute failed: Could not initialize class com.tripos.common.ct.Translator

Anyway: while the functionality of our Fingerprint Similarity node may be the same as that of the Tripos one, the Indigo fingerprints are different from the Tripos fingerprints.

 

Best regards,

Dmitry

Thank you. How about if I have more than one reference molecule? It would be great if a column for each reference molecule were passed on to the output table (as a default or an option). The 'Aggregation type' is simpler to use, but it would also be nice to have the individual similarity values for later use. By the way, could you also add K-NN or Centroid options to the 'Aggregation type'?

Thanks again for the help!!

M.

Dmitry,
I got these errors as well. It looks like the node is quite sensitive to formats: even input coming from the Symyx SDF reader nodes does not work!
The workaround I found is as follows:

  • go from the format you have to SMILES
  • feed the SMILES into the Unity FP node and then into the Tanimoto distance node
    Note that the CDK FP and the RDK FP are accepted by the Tripos Tanimoto distance node.

Regards,
Bruno

Hi M,

You can get similarity values per individual fingerprint by using the Distance Matrix Calculator in KNIME under Distance Matrix. It will put all the values in one column separated by a comma, but you can split this to individual columns using the Cell Splitter node.
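Outside KNIME, the splitting step can be sketched in a few lines of Python. This only illustrates the idea of turning one comma-separated cell into one column per reference fingerprint; it is not what the Cell Splitter node does internally, and the helper names, the `ref_` column prefix and the sample values are all assumptions.

```python
def split_cell(cell, sep=","):
    """Split one comma-separated cell into a list of floats."""
    return [float(part) for part in cell.split(sep) if part.strip()]


def to_columns(cells, prefix="ref_"):
    """Turn each cell into a row with one named column per value."""
    table = []
    for cell in cells:
        values = split_cell(cell)
        table.append({f"{prefix}{i + 1}": v for i, v in enumerate(values)})
    return table


# Invented sample: one cell per database molecule, one value per reference.
cells = ["0.10,0.25,0.90", "0.40,0.05,0.33"]
table = to_columns(cells)

for row in table:
    print(row)
```

Note that a distance matrix holds distances rather than similarities. If the distance is defined as 1 minus the Tanimoto similarity (an assumption to check against the distance function actually selected), each value would need to be converted back with `1 - d`.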

Hope this helps

Simon.

sisaym: Does the "Distance Matrix Calculator" node that Simon suggested work for you here?

As for the K-NN and Centroid options: it seems that you are suggesting K-means or K-NN clustering of the input set of molecules, with the reference molecules forming the clusters, right? If yes, then I would say that this task is a bit outside Indigo's scope, which is organic chemistry. KNIME has its own nodes for clustering; are they not sufficient for that? If we can do anything to pass the data to KNIME's own nodes in a more convenient way, then of course we will be happy to do that. What are your thoughts?

 

Best regards,

Dmitry

Hi,

How can I convert the Distance Matrix column to something the Cell Splitter node can handle?