Fingerprint similarity node

In the fingerprint similarity node config, what do Min, Max, and Avg mean? Which should I use when?

Jiza

When you have a table of fingerprints, you are measuring the similarity of each fingerprint against a template fingerprint. In this case, min, mean, and max do nothing, because there is only one similarity value per row to report.

However, if you have multiple template fingerprints, you can choose which value is reported: the minimum similarity across all template fingerprints, the maximum similarity, or the average of the similarity values computed against each template fingerprint.
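
To make the three options concrete, here is a minimal sketch in plain Python. It is not the node's actual implementation; the function names (tanimoto, aggregate_similarity) and the toy fingerprints are made up for illustration, and fingerprints are represented simply as sets of on-bit positions.

```python
from statistics import mean


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def aggregate_similarity(query, templates, mode):
    """Score one query against every template, then reduce to a single value.

    Hypothetical helper: 'mode' mirrors the Min / Max / Avg choice.
    """
    scores = [tanimoto(query, t) for t in templates]
    if mode == "min":
        return min(scores)
    if mode == "max":
        return max(scores)
    if mode == "avg":
        return mean(scores)
    raise ValueError(f"unknown mode: {mode}")


# Toy data (illustrative only): one query, three templates.
query = {1, 2, 3, 4}
templates = [{1, 2, 3, 4}, {1, 2}, {5, 6}]  # per-template scores: 1.0, 0.5, 0.0

for mode in ("min", "max", "avg"):
    print(mode, aggregate_similarity(query, templates, mode))
```

With these toy fingerprints, max picks out the closest template, min the most distant one, and avg summarises similarity to the template set as a whole.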

simon.

Thanks Simon. When you say it does nothing, what do you mean? I have the MACCS fingerprint of a query molecule on one port and the MACCS fingerprints of about 15000 molecules on the other port. It found one match. Choosing min, max, or average gives me a different Tanimoto value each time. That's what prompted my question.

Jiza

 

I think this is because your 15000 molecules are being used as the template molecules and your 1 molecule is being used as the query molecule, so the node reports back a single value for that one molecule. In this case, yes, you will get different results, as it will report back the min, max, or mean over the 15,000 template compounds.

If you had one template molecule (i.e., your data sets were connected the other way round on the node's in ports), you should find these options have no effect.
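
As a rough illustration of why the port order matters, the sketch below scores the same toy data both ways round. It is plain Python rather than anything the node actually runs, and the names (similarity_column, library) and the fingerprints are assumptions made up for the example.

```python
from statistics import mean

AGGREGATORS = {"min": min, "max": max, "avg": mean}


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_column(rows, templates, mode):
    """One output value per input row, aggregated over all templates (hypothetical helper)."""
    agg = AGGREGATORS[mode]
    return [agg([tanimoto(row, t) for t in templates]) for row in rows]


# Toy stand-ins for the single query and the large library.
query = {1, 2, 3, 4}
library = [{1, 2, 3, 4}, {1, 2}, {5, 6}]

# Query as the input row, library as templates: one row out, value depends on the mode.
one_row = {m: similarity_column([query], library, m) for m in AGGREGATORS}

# Library as the input rows, query as the single template: one value per library
# molecule, and the mode makes no difference.
many_rows = {m: similarity_column(library, [query], m) for m in AGGREGATORS}

print(one_row)
print(many_rows)
```

In the first orientation the single output value changes with the mode; in the second, every library molecule gets its own value and all three modes agree.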

simon.
