I would like to make a one-by-one MCS comparison for a list of SMILES strings using the "MCS Scaffold Finder" node. I tried using the "Chunk Loop" node to get through my list line by line - which is working fine. But how do I add a second, inner loop? And what would be a good way to avoid comparing the same pair of molecules twice?
Any suggestion will be highly appreciated.
I see two tasks: a) clean up both lists before you go into the loop(s), for example using the Reference Row Filter, and b) nest the loops correctly. Nested loops need to be completely embedded in the outer loop, that is, Loop Start - Loop Start - Loop End - Loop End. In your case, you need a connection from one Loop Start to the other Loop Start. Since the data ports are already taken by the data connections, I would recommend using the (hidden) variable ports of those nodes to define the execution order. Hope this helps.
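For anyone who prefers to see the nesting written out, here is a minimal Python sketch of the same structure outside KNIME. It is not a KNIME workflow; the function name `unique_pairs` and the example SMILES are made up for illustration. Starting the inner loop one position past the outer index is one common way to visit each pair only once.

```python
def unique_pairs(smiles):
    """Return every unordered pair of entries exactly once."""
    pairs = []
    for i in range(len(smiles)):               # outer Loop Start
        for j in range(i + 1, len(smiles)):    # inner Loop Start, fully nested
            pairs.append((smiles[i], smiles[j]))
        # inner Loop End
    # outer Loop End
    return pairs


# Hypothetical input list, for illustration only
molecules = ["CCO", "CCN", "c1ccccc1"]
pairs = unique_pairs(molecules)
for first, second in pairs:
    print(first, second)
```

With n molecules this gives n*(n-1)/2 comparisons instead of n*n, and no molecule is compared with itself.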
Hi, you should be able to avoid two loops.
Use the maths node on both data streams. In the first, use the row index as the expression; in the second, use the row index + 0.5. Then concatenate the data and sort it on that new column. Your two data streams are now intermixed, so you only need one Chunk Loop Start with the chunk size set to 2 rows, and one Loop End.
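The interleaving trick can be sketched in plain Python as follows. This is only an illustration of the idea, not KNIME code; the helper names `interleave` and `chunks_of_two` and the example SMILES are assumptions of mine.

```python
def interleave(first, second):
    """Key stream one by row index and stream two by row index + 0.5,
    concatenate both, then sort by that key."""
    keyed = [(i, row) for i, row in enumerate(first)]
    keyed += [(i + 0.5, row) for i, row in enumerate(second)]
    keyed.sort(key=lambda item: item[0])
    return [row for _, row in keyed]


def chunks_of_two(rows):
    """Mimic a chunk loop with the chunk size set to 2 rows."""
    return [rows[k:k + 2] for k in range(0, len(rows), 2)]


# Hypothetical example streams, for illustration only
first = ["CCO", "CCN", "CCC"]
second = ["c1ccccc1", "CC(=O)O", "CCCl"]
paired = chunks_of_two(interleave(first, second))
for chunk in paired:
    print(chunk)
```

Note that in this sketch each chunk holds row i of the first stream together with row i of the second, so it assumes both streams have the same number of rows.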
As mentioned, use the Reference Row Filter to remove duplicate molecules. I wouldn't worry about pairs being duplicated anyway, as you could sort this out at the end using a GroupBy node on the SMILES column; this way any duplicate MCSs will be removed.
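As a rough picture of what the grouping step does, here is a small Python sketch. The result rows are placeholder values I made up to show the shape of the data (molecule A, molecule B, MCS as SMILES); they are not output from the MCS Scaffold Finder, and `group_by_mcs` is a hypothetical helper.

```python
def group_by_mcs(rows):
    """Group result rows on the MCS SMILES column so each MCS appears once."""
    grouped = {}
    for mol_a, mol_b, mcs in rows:
        grouped.setdefault(mcs, []).append((mol_a, mol_b))
    return grouped


# Placeholder rows for illustration: the first two are the same pair in swapped order
results = [
    ("CCO", "CCN", "CC"),
    ("CCN", "CCO", "CC"),
    ("CCO", "CCCO", "CCO"),
]
unique_mcs = group_by_mcs(results)
for mcs, source_pairs in unique_mcs.items():
    print(mcs, source_pairs)
```

The swapped pair collapses into a single group because both rows carry the same MCS string, which is why duplicated pairs do no harm apart from the extra computation.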
Thanks for the comments. This was really useful!