substructure search with rdkit

jada · January 7, 2011, 2:17pm

hi guys,

we have some problems with a KNIME project: we have two databases and want to use each molecule from the first database as a substructure query against the second one, using the RDKit Substructure Filter node.

we know how to find substructures for one molecule. our problem is automating the process, so that on each loop iteration the next molecule from the first db is taken and searched for in the second db. maybe we have to use flow variables, but we don't know how...

thanks for your help!

best regards,

jada

greglandrum · January 7, 2011, 5:58pm

Hi Jada,
A KNIME node that runs substructure searches on the molecules from input 2, using the molecules from input 1 as queries, is something that we’ve talked about doing. A concrete question for you: if you have M molecules and N queries, what would you like to find in the output table?
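To make the question concrete, here is a rough sketch in Python with the RDKit toolkit of three output shapes one could imagine for M molecules and N queries. The SMILES strings and the names `match_matrix`, `match_pairs`, and `any_match` are invented for illustration; none of this describes an existing KNIME node.

```python
from rdkit import Chem

# Hypothetical inputs: M = 3 molecules, N = 2 queries.
molecule_smiles = ["Oc1ccccc1", "CCO", "Nc1ccccc1"]  # phenol, ethanol, aniline
query_smiles = ["c1ccccc1", "N"]                     # benzene ring, any nitrogen

molecules = [Chem.MolFromSmiles(s) for s in molecule_smiles]
queries = [Chem.MolFromSmiles(s) for s in query_smiles]

# Shape 1: an M x N table with one boolean column per query.
match_matrix = [[mol.HasSubstructMatch(q) for q in queries] for mol in molecules]

# Shape 2: a long table with one row per (molecule, query) hit.
match_pairs = [
    (m_smi, q_smi)
    for m_smi, row in zip(molecule_smiles, match_matrix)
    for q_smi, hit in zip(query_smiles, row)
    if hit
]

# Shape 3: only the molecules that match at least one query.
any_match = [m_smi for m_smi, row in zip(molecule_smiles, match_matrix) if any(row)]

print(match_matrix)
print(match_pairs)
print(any_match)
```

Shape 1 keeps every molecule and grows wider with N, shape 2 grows with the number of hits, and shape 3 discards the information about which query matched.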

-greg

James_Davidson · January 20, 2011, 10:16am

Hi Jada,

Sorry for the rather slow response - I guess you may already have worked around this problem by now(?) If not, then I think the answer is "yes": you can accomplish this using flow variables with the current RDKit nodes.

I have made a quick example workflow (KNIME 2.3.0) that starts with two small SMILES 'databases' and loops over each mol in the 2nd, using it as the query for the Substructure Filter node that is processing the 1st. There is also a little bit of processing afterwards to join the query mols back in to double-check the match.
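For anyone who wants to see the logic outside KNIME, the same loop can be sketched in plain Python with the RDKit toolkit. This is only an illustration and not the contents of the example workflow: the SMILES lists and the names `database_smiles`, `query_smiles`, and `matches` are made up here.

```python
from rdkit import Chem

# Hypothetical stand-ins for the two small SMILES 'databases'.
database_smiles = ["Oc1ccccc1", "CCO", "Nc1ccccc1", "CCN"]  # 1st: searched
query_smiles = ["c1ccccc1", "N"]                            # 2nd: queries

# Parse the searched database once, outside the loop.
database = [Chem.MolFromSmiles(s) for s in database_smiles]

matches = []  # rows of (query SMILES, matching database SMILES)
for q_smi in query_smiles:
    # One loop iteration per query, like one pass of the KNIME loop.
    query = Chem.MolFromSmiles(q_smi)
    for db_smi, mol in zip(database_smiles, database):
        if mol.HasSubstructMatch(query):
            # Keep the query next to each hit so the match can be checked.
            matches.append((q_smi, db_smi))

for q_smi, db_smi in matches:
    print(q_smi, "is a substructure of", db_smi)
```

Storing the query alongside each hit plays the same role as the join step in the workflow: every result row can be traced back to the query that produced it.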

I'm not sure if there is an easy way to share workflows on the KNIME site(?), but I couldn't find one - and I couldn't upload a .zip file to my GoogleDocs - so I have set up a 'RapidShare' account and put the example there:

http://rapidshare.com/files/443533423/Two_DB_Substructure_Query.zip

Hope this helps.

Kind regards

James
