I would like to make a diverse selection of compounds, focusing on building-block diversity.
I have a dataset containing several libraries (several scaffolds). One library is made of one scaffold and thousands of building blocks. Furthermore, in my dataset I have several chemical structure columns: one column for scaffolds, one column for building blocks and finally one column for final compounds.
My goal here is to make a diverse selection per library (1 scaffold = 1 library) by looking at the building blocks only, and I do not want to use the same building block twice across the library selections.
In my workflow, I started with a Group Loop Start node (to group by scaffold), continued with the Reference Row Filter node (to exclude building blocks already used in the loop), then used the RDKit Diversity Picker (diversity on the building blocks), and finally added the Loop End node.
The issue is that I am not able to connect the loop end information containing the selected compounds (column with already used building blocks) to the Reference Row Filter.
Any idea how to extract and inject variable information from a loop end node to a Reference Row Filter node?
Thank you in advance.
I would collect the information inside the loop. Use the Loop End (2 ports) node and feed it a data table with only one row, containing the variable. You can get an empty table with rows but no columns using the Empty Table Creator node.
I am not sure I understand your suggestion.
Maybe my explanation was not clear. I need to collect the data from the loop end at each iteration, and I would like to feed this data backwards into the Reference Row Filter, which sits earlier in the loop.
I tried many possibilities, but KNIME does not let me connect any data backwards to an earlier node.
Any idea how to solve this issue?
Many thanks in advance.
I think you want the recursive loop nodes, but only if I've guessed correctly what you mean by connecting backwards data...
Yes, Sam, I think you are right.
Julien, the only way to use information from inside the loop in another iteration is with the recursive loop nodes.
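To make the idea of feeding data back concrete, here is a minimal Python sketch of how a recursive loop behaves, as far as I understand it: at each iteration the loop body produces one table that is collected and one table that is handed back to the loop start for the next iteration. The names `recursive_loop` and `pick_one_and_filter` and the (scaffold, building block) row layout are hypothetical, used only for illustration; this is not KNIME code.

```python
from typing import Callable, List, Tuple

Row = Tuple[str, str]  # (scaffold, building_block), a hypothetical layout
Body = Callable[[List[Row]], Tuple[List[Row], List[Row]]]


def recursive_loop(table: List[Row], body: Body, max_iterations: int) -> List[Row]:
    """Run `body` repeatedly, passing its second result back in as the next input."""
    collected: List[Row] = []
    for _ in range(max_iterations):
        if not table:  # nothing left to process, stop early
            break
        output, table = body(table)  # `table` is now the data fed back
        collected.extend(output)
    return collected


def pick_one_and_filter(table: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Toy loop body: keep the first row, then feed back only the rows that
    belong to other scaffolds and do not reuse the chosen building block."""
    scaffold, block = table[0]
    fed_back = [r for r in table if r[0] != scaffold and r[1] != block]
    return [table[0]], fed_back


rows = [("S1", "A"), ("S1", "B"), ("S2", "A"), ("S2", "C")]
result = recursive_loop(rows, pick_one_and_filter, max_iterations=10)
print(result)
```

Because the table fed back already excludes the finished scaffold and the used building block, no separate grouping loop is needed in this sketch.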
Thank you Sam and Iris.
As suggested I tried the recursive loop.
I still have two problems.
First, it seems that the Recursive Loop Start node does not get any data back from the Recursive Loop End node.
Second (certainly related to the first issue), I do not know how to handle the iterative process in the loop end configuration. As you can see, I have two loops, and I would like the Recursive Loop Start node to follow the iterations given by the Group Loop Start node. Then I want the Recursive Loop End node to send the accumulated information back to the Recursive Loop Start node at each iteration.
I will try again to explain with a concrete example what I'd like to achieve.
In my table, I have a column containing 20'000 unique molecules, a second column containing 4 different scaffold numbers (this means 5'000 molecules per scaffold) and a third column containing the building blocks used on each scaffold to make the final molecules.
I would like to make a diverse selection of 100 compounds per scaffold, but I do not want to use the same building block twice across the selections. In the end, I should get 400 molecules from 4 different scaffolds, each decorated with a unique building block.
I tried with 2 loops: a group loop, to group by scaffold number, and, since I need the information from previous iterations, a recursive loop as well. But it seems that I would need a combination of both (a group loop with a recursive option) in a single loop for this to work.
What do you think? If you have any idea that would be very helpful.
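For reference, the selection logic described above can be written down outside KNIME as a short Python sketch. It is only an illustration under several assumptions: the row layout (molecule, scaffold, building block), the function names `max_min_pick` and `select_per_scaffold`, and the toy one-dimensional distance are all hypothetical, and the greedy max-min picker is a stand-in for whatever the RDKit Diversity Picker does internally. A real run would use a fingerprint-based distance between building blocks.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Set, Tuple

Row = Tuple[str, str, str]  # (molecule, scaffold, building_block), hypothetical
Distance = Callable[[str, str], float]


def max_min_pick(items: Iterable[str], distance: Distance, n: int) -> List[str]:
    """Greedy max-min picking: start from the first item, then repeatedly add
    the candidate whose nearest already-picked item is farthest away."""
    items = list(dict.fromkeys(items))  # drop duplicates, keep order
    if not items or n <= 0:
        return []
    picked = [items[0]]
    # distance from each remaining candidate to its nearest picked item
    nearest: Dict[str, float] = {it: distance(it, picked[0]) for it in items[1:]}
    while nearest and len(picked) < n:
        best = max(nearest, key=nearest.get)
        picked.append(best)
        del nearest[best]
        for it in nearest:
            nearest[it] = min(nearest[it], distance(it, best))
    return picked


def select_per_scaffold(rows: Iterable[Row], distance: Distance, n: int) -> List[Row]:
    """Pick up to `n` diverse building blocks per scaffold, never reusing a
    building block that an earlier scaffold has already taken."""
    by_scaffold: Dict[str, Dict[str, str]] = defaultdict(dict)
    for molecule, scaffold, block in rows:
        by_scaffold[scaffold].setdefault(block, molecule)
    used: Set[str] = set()  # the state carried from one scaffold to the next
    selection: List[Row] = []
    for scaffold, block_to_molecule in by_scaffold.items():
        candidates = [b for b in block_to_molecule if b not in used]
        for block in max_min_pick(candidates, distance, n):
            used.add(block)
            selection.append((block_to_molecule[block], scaffold, block))
    return selection


# Toy data: a made-up numeric descriptor per building block.
descriptor = {"A": 0.0, "B": 1.0, "C": 5.0, "D": 9.0}


def toy_distance(a: str, b: str) -> float:
    return abs(descriptor[a] - descriptor[b])


rows = [(f"{s}-{b}", s, b) for s in ("S1", "S2") for b in "ABCD"]
selection = select_per_scaffold(rows, toy_distance, 2)
print(selection)
```

The `used` set plays the role of the information that has to travel backwards in the workflow: it grows at each scaffold and filters the candidates of the next one. Note that scaffolds processed later see a smaller candidate pool, so the processing order influences the result.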