Lingo similarity

I was wondering if Lingo similarity could be added to the Indigo similarity nodes?



Lingo seems somewhat outside Indigo's scope. Indigo works with molecules and analyzes molecular structure, whereas the Lingo similarity algorithm works with SMILES strings and analyzes string fragments. What advantages does Lingo similarity have over "normal" structure-based similarity (Indigo or other)?

Also, I have a feeling that the Lingo algorithm may give different results depending on the ordering of atoms in the given structures ("OCO" vs "C(O)O"), which is not very convenient. What are your thoughts?
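
To make the concern concrete, here is a rough Python sketch. It is a simplified stand-in, not the published Lingo formula: it just takes the set of overlapping 4-character substrings of each SMILES string and computes a Jaccard ratio. The function names `lingos` and `lingo_jaccard` are made up for this illustration, and strings shorter than 4 characters are treated as a single fragment, which is my own assumption.

```python
def lingos(smiles, q=4):
    """Return the set of overlapping q-character substrings of a SMILES string.

    Strings shorter than q are kept as a single fragment (an assumption of
    this sketch, not something taken from the Lingo method itself).
    """
    if len(smiles) < q:
        return {smiles}
    return {smiles[i:i + q] for i in range(len(smiles) - q + 1)}


def lingo_jaccard(a, b, q=4):
    """Set-based Jaccard similarity over the q-character fragments."""
    la, lb = lingos(a, q), lingos(b, q)
    return len(la & lb) / len(la | lb)


# Same molecule, two different atom orderings in the SMILES string.
print(lingo_jaccard("OCO", "C(O)O"))    # 0.0, no fragment in common
print(lingo_jaccard("CCCCO", "OCCCC"))  # 1 shared fragment out of 3
print(lingo_jaccard("CCCCO", "CCCCO"))  # 1.0 for identical strings
```

If this is roughly how it works, the input SMILES would have to be canonicalized by one and the same tool before comparison, otherwise identical molecules can score well below 1.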


Nodes that compute structure similarity from fingerprints (FPs) can sometimes be fooled by extreme similarity (think of enantiomers, which differ only in the configuration of a chiral center)…
As an example, take the Tanimoto similarity of CC[C@H](N)CO and CC[C@@H](N)CO, the two enantiomers of aminobutanol: the score is 1 (computed with the Indigo nodes and the 10 qwords FP setting), yet we know these are two different compounds. I think Lingo can make a difference here.
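
A quick sketch of why a string-based score would separate them, assuming the two enantiomers are written as CC[C@H](N)CO and CC[C@@H](N)CO. This is a simplified count-based Tanimoto over 4-character substrings, not necessarily the exact Lingo formula, and `lingo_counts` and `lingo_tanimoto` are names I made up for the illustration.

```python
from collections import Counter


def lingo_counts(smiles, q=4):
    """Count the overlapping q-character substrings of a SMILES string."""
    n = max(len(smiles) - q + 1, 1)  # short strings yield one fragment
    return Counter(smiles[i:i + q] for i in range(n))


def lingo_tanimoto(a, b, q=4):
    """Multiset Tanimoto: sum of min counts over sum of max counts."""
    ca, cb = lingo_counts(a, q), lingo_counts(b, q)
    shared = sum((ca & cb).values())  # Counter & keeps the minimum count
    total = sum((ca | cb).values())   # Counter | keeps the maximum count
    return shared / total


r_form = "CC[C@H](N)CO"
s_form = "CC[C@@H](N)CO"
# The fragments spanning the @ / @@ marker differ, so the score drops below 1
# (7 shared fragments out of 12 distinct ones with q=4).
print(lingo_tanimoto(r_form, s_form))
```

The score only reflects the stereo marker because it is literally part of the string, so it says "different" without saying anything about how different the two compounds are chemically.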

