Hidden feature of Fingerprint Similarity node


The CDK Fingerprint Similarity node has an undocumented feature: when searching for the Max or Min Tanimoto similarity, it returns a "Reference" column containing the RowIDs that correspond to that Max or Min similarity. This is a nice feature that other Fingerprint Similarity nodes in KNIME do not have, so it may be worth documenting in the node dialog box.

See the post below which really helped James out.



I have noticed that for similarity ties, it separates the RowIDs with a | character. It may be more useful to have the "Reference" column as a Collection Type column so that the user can manipulate the list of RowIDs more easily (e.g. using the Ungroup node or the Split Collection Column node).




If I dare mention the Java Snippet (the old 'simple' one!), then setting the return type to String, checking the 'array return' box, and using the snippet

return $Reference$.split("\\|");

will have the desired effect in the interim (although I agree that the option to return a set/collection would be useful).
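
For anyone wanting to check the splitting outside KNIME, here is a minimal standalone sketch in plain Java. The class name, method name and sample RowIDs are made up for illustration and are not part of the node or the snippet. One thing to watch: String.split takes a regular expression, and | is the regex alternation operator, so the pipe has to be escaped to be treated as a literal separator.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, only to illustrate the split; not part of KNIME or CDK.
class ReferenceSplitter {

    // Splits a "Reference" value such as "Row3|Row17" into its RowIDs.
    // The pipe is escaped because String.split interprets its argument
    // as a regular expression, where a bare | means alternation.
    static List<String> splitReference(String reference) {
        return Arrays.asList(reference.split("\\|"));
    }

    public static void main(String[] args) {
        // Made-up RowIDs representing a three-way similarity tie.
        System.out.println(splitReference("Row3|Row17|Row42"));
        // prints [Row3, Row17, Row42]
    }
}
```

A value without ties (a single RowID and no pipe) comes back as a one-element list, so the same code handles both cases.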



Now it is documented. :) Thanks for pointing that out.

I agree that it would be useful to offer the option to return a collection type cell, even though the Java Snippet is still a useful node! I have altered the node to that effect.


Many thanks Stephan,

I have just tried out the changes and they work perfectly. The collection cell output format will be very useful.