Diversity Picker

Hello,

I have a problem with the RDKit Diversity Picker node, which I am trying to use to pick the most active compounds. The input (table 1) contains 100 molecules, the number to pick is 20, and the random seed is -1.

I repeated the picking process several times, and the results were not the same: the output included different compounds each time.

Could you please help me? The KNIME version is 4.2.3

Welcome to the forum.

You need to actually pick a random seed that’s not -1. Any positive integer will work.

Thank you so much. It turns out that the random seed is the key. How should I choose the random seed?

Quan

It doesn’t matter. You can pick whichever positive random seed you want.

Thank you, Greg. I tried different positive random seeds and obtained different results each time. I don’t know which one is best.

That’s like complaining that you get a different sample every time you do random sampling.

Using a specific random seed allows you to reproduce a random sampling event from run to run. If you change the random seed, of course you’ll get a different sampling result.
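
To make that concrete, here is a minimal Python sketch of seeded sampling. It uses plain random sampling from the standard library, not the node's actual picking algorithm, and the names `pick_subset` and `compounds` are made up for illustration. Treating a seed of -1 as "no fixed seed" is an assumption that mirrors the behaviour described in this thread.

```python
import random


def pick_subset(items, n_pick, seed=-1):
    """Randomly pick n_pick items.

    Assumption for this sketch: a negative seed means "no fixed seed",
    so every call may return a different subset.
    """
    rng = random.Random(None if seed < 0 else seed)
    return rng.sample(items, n_pick)


# Hypothetical stand-ins for the 100 input molecules.
compounds = [f"mol_{i}" for i in range(100)]

first = pick_subset(compounds, 20, seed=42)
second = pick_subset(compounds, 20, seed=42)  # same seed: same picks
other = pick_subset(compounds, 20, seed=7)    # different seed: different picks

print(first == second)
print(first == other)
```

The same fixed seed gives the same subset on every run, while a different seed (or no fixed seed) generally gives a different one.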

There is no such thing as “the best” random seed.
If you’re concerned about variance, then pick 3-5 seeds, do your analysis for each, and evaluate the variance in the results to see if it really matters.
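
As a rough illustration of that check, the sketch below runs a simple MaxMin-style diversity pick with several seeds and reports how much the picked sets overlap. Everything here is an assumption made for the example: the random 2D points stand in for molecular fingerprints, Euclidean distance stands in for a fingerprint distance, and `maxmin_pick` and `jaccard` are hypothetical helpers, not the node's implementation.

```python
import math
import random
from itertools import combinations


def maxmin_pick(points, n_pick, seed):
    """Greedy MaxMin-style pick: start from a seeded random point, then
    repeatedly add the point farthest from everything picked so far."""
    rng = random.Random(seed)
    picked = [rng.randrange(len(points))]
    # Distance from each point to its nearest already-picked point.
    min_dist = [math.dist(p, points[picked[0]]) for p in points]
    while len(picked) < n_pick:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        picked.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return picked


def jaccard(a, b):
    """Overlap of two picks: 1.0 means identical sets, 0.0 means disjoint."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


# Synthetic stand-in data: 100 "molecules" as random 2D points.
data_rng = random.Random(0)
points = [(data_rng.random(), data_rng.random()) for _ in range(100)]

seeds = [1, 2, 3, 4, 5]
picks = {s: maxmin_pick(points, 20, s) for s in seeds}

overlaps = [jaccard(picks[a], picks[b]) for a, b in combinations(seeds, 2)]
print(f"mean overlap between seeds: {sum(overlaps) / len(overlaps):.2f}")
print(f"min overlap between seeds:  {min(overlaps):.2f}")
```

If the overlap between seeds is high, or your downstream results barely change from seed to seed, the choice of seed does not really matter for your analysis.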
