Comparing molecules by Tversky similarity

evert.homan_scilifelab.se · March 10, 2017, 10:02am

Hi,

I would like to compare molecule files A and B and find all molecules in file B that have a Tversky similarity >0.8 to any molecule in file A, using structural fingerprints for comparison.

Any tips for an efficient workflow are highly appreciated!

Thanks/Evert

jonfuller · March 15, 2017, 4:05pm

Hi Evert,

I think that the following workflow should do what you need, with the following caveats.

a) The row splitter simulates File A and File B (I didn't have two SDF files with similar enough compounds).

b) The similarity calculated here is Tanimoto. If you really need Tversky, I think you'll need to use the Java Distance node to define the Tversky distance and pass its output port into the Similarity Search node.
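For reference, here is a minimal Python sketch of the Tversky index itself, assuming each fingerprint is represented as a set of "on" bit positions. The function name and the example bit sets are hypothetical, and this is not the code inside the Java Distance node. With alpha = beta = 1 the index reduces to Tanimoto, and a distance can be taken as 1 minus the similarity.

```python
def tversky(a, b, alpha=1.0, beta=1.0):
    """Tversky similarity between two collections of 'on' fingerprint bits.

    alpha weights bits found only in a, beta weights bits found only in b.
    alpha = beta = 1 gives Tanimoto; alpha = beta = 0.5 gives Dice.
    """
    a, b = set(a), set(b)
    common = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    denom = common + alpha * only_a + beta * only_b
    # Two empty fingerprints have no shared features; treat them as dissimilar.
    return common / denom if denom else 0.0


# Hypothetical fingerprints, given as sets of bit positions.
ref = {1, 4, 7, 9}
query = {1, 4, 7}

tanimoto = tversky(ref, query)                         # symmetric case
asymmetric = tversky(query, ref, alpha=1.0, beta=0.0)  # ignores bits only in ref
distance = 1.0 - tanimoto
```

Note that with unequal alpha and beta the result depends on which molecule is passed first, so the reference/query order matters.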

Best,

Jon

tversky_distance.knwf

Alastair2 · April 4, 2017, 2:06pm

The Indigo 2 fingerprint similarity node will directly calculate a Tversky similarity for you.

evert.homan_scilifelab.se · January 15, 2018, 4:49pm

I managed to combine the two proposed solutions and made a workflow that generates a Tversky similarity column for each reference molecule by looping over them one at a time. It is still surprisingly fast.

Hopefully this is useful for other users.
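Outside KNIME, the same looping logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: fingerprints are stored as integer bitmasks, the names `tversky_bits` and `screen` and the toy data are hypothetical, and alpha = beta = 0.5 is an arbitrary choice, not the setting used in the workflow.

```python
def tversky_bits(a, b, alpha=0.5, beta=0.5):
    """Tversky similarity for fingerprints stored as non-negative int bitmasks."""
    common = bin(a & b).count("1")
    only_a = bin(a & ~b).count("1")
    only_b = bin(b & ~a).count("1")
    denom = common + alpha * only_a + beta * only_b
    return common / denom if denom else 0.0


def screen(file_a, file_b, threshold=0.8, alpha=0.5, beta=0.5):
    """Return molecules from file_b whose best score against any file_a
    molecule exceeds the threshold, mapped to that best score."""
    hits = {}
    for ref_fp in file_a.values():           # loop over reference molecules
        for name, fp in file_b.items():      # score every candidate against it
            score = tversky_bits(ref_fp, fp, alpha, beta)
            if score > threshold and score > hits.get(name, 0.0):
                hits[name] = score
    return hits


# Toy data: molecule name -> fingerprint bitmask (hypothetical values).
file_a = {"ref1": 0b11110000, "ref2": 0b00001111}
file_b = {"b1": 0b11110001, "b2": 0b00111100, "b3": 0b00001111}

hits = screen(file_a, file_b)
```

The nested loop costs one comparison per (reference, candidate) pair, which is cheap for bit-level operations and may be why the looped workflow stays fast.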

Cheers/Evert