Searching for molecules

Is there a way to check which molecules in one list (SDF, MOL, or SMILES) also occur in a second list of molecules? Ideally with an option to search for either exact or similar molecules?



You may want to have a look at the RDKit nodes, available from the Community Contributions. To check whether a molecule exists in the second list, you could, for example, create canonical SMILES for each molecule and then do an inner join on the canonical SMILES. There is also a substructure filter node that uses SMARTS for searching.

As Thor mentioned, the RDKit nodes are very good, and substructure searching is available. To simply see which molecules in one list are present in a second list, use "Molecules to RDKit" followed by "RDKit Canonical Smiles" on both lists so that all structures are formatted the same way, and then use the Reference Row Filter from the standard KNIME set of nodes.
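
For anyone who wants to try the same check outside KNIME, the canonical-SMILES comparison described in the replies above can be sketched in Python with the RDKit toolkit. This is only a rough illustration, not what the KNIME nodes do internally: the function names `rdkit_canonical_smiles` and `molecules_in_reference` are made up for this example, the inputs are assumed to be SMILES strings, and the canonicalizer is passed in as a parameter so a different toolkit could be swapped in.

```python
from typing import Callable, Iterable, List, Optional


def rdkit_canonical_smiles(smiles: str) -> Optional[str]:
    """Return RDKit's canonical SMILES, or None if the input cannot be parsed."""
    # Imported here so the matching logic below does not itself depend on RDKit.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    return Chem.MolToSmiles(mol) if mol is not None else None


def molecules_in_reference(
    query: Iterable[str],
    reference: Iterable[str],
    canonicalize: Callable[[str], Optional[str]] = rdkit_canonical_smiles,
) -> List[str]:
    """Return the query entries whose canonical form also occurs in reference.

    Hypothetical helper: it plays the role of the inner join / Reference Row
    Filter step, using the canonical string as the comparison key.
    """
    reference_keys = {canonicalize(s) for s in reference}
    reference_keys.discard(None)  # ignore reference entries that failed to parse
    # Unparsable query entries canonicalize to None and are therefore dropped.
    return [s for s in query if canonicalize(s) in reference_keys]
```

With RDKit installed, `molecules_in_reference(["OCC", "c1ccccc1"], ["CCO"])` should keep `"OCC"`, since both strings describe ethanol and therefore share one canonical form. Note that this covers exact matches only; substructure or similarity searching needs a different comparison.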