I would like to align two molecules using the following nodes for each molecule: RDKit Add Hs => RDKit Add Conformers => RDKit Optimize Geometry => RDKit Remove Hs. Then, for the reference compound, I select the 3D structure with the minimal energy from RDKit Optimize Geometry and run an RDKit Open 3D Alignment. When I look at the best-scoring result from RDKit Open 3D Alignment, I have the feeling that it could be improved.
Is this approach appropriate for aligning compounds to a reference compound? What would be the best method to select the best alignment? Could I use the best-scoring result from RDKit Open 3D Alignment?
I am also wondering whether the parameters could be fine-tuned to improve the workflow:
- For RDKit Add Conformers, what would be your recommendation for Number of conformers, Maximum number of tries to generate conformers, and RMS threshold for keeping a conformer? I use 200, 30 and 0.5 respectively.
- For RDKit Optimize Geometry, how many iterations would you recommend? I kept 1000 and did not tick the option to remove starting coordinates before optimizing the molecule.
- For RDKit Open 3D Alignment, what should the Maximal number of iterations be? I use 200.
Thanks for your help,
What to do depends on what you have and the molecules you use. I have such a workflow to compare multiple other molecules to a reference molecule. What I do is generate conformers for all molecules (the reference and the molecules of interest) and then do a cross-join of all reference conformations against all target conformations. Yes, this can explode quickly.
I use the Generate Conformers node with the ETKDG method and, as that publication shows, for smallish molecules it’s usually enough to generate 50 conformations (doing 100 or 200 doesn’t get you any closer to the real minimum). Of course there are exceptions, like macrocycles or highly flexible molecules in general. I disable the UFF clean-up in that node, as ETKDG’s big advantage is supposed to be that it doesn’t need clean-up. However, I do use the Optimize Geometry node with MMFF94, but with just 1 iteration, to get energy values. This can help to remove conformers whose energy is farther above the lowest one than your threshold of choice.
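To make the energy cut-off concrete, here is a minimal sketch in plain Python of that filtering step. It assumes you already have one energy value per conformer (for example from the Optimize Geometry node); the function name `filter_by_energy_window`, the energies and the 3.0 window are made up for illustration and are not a recommendation.

```python
from typing import Dict, List


def filter_by_energy_window(energies: Dict[int, float], window: float) -> List[int]:
    """Return the ids of conformers whose energy lies within `window`
    of the lowest-energy conformer, sorted from lowest to highest energy."""
    if not energies:
        return []
    e_min = min(energies.values())
    kept = [cid for cid, e in energies.items() if e - e_min <= window]
    return sorted(kept, key=lambda cid: energies[cid])


# Hypothetical energies keyed by conformer id (same unit as the window).
example_energies = {0: 12.4, 1: 10.1, 2: 25.0, 3: 13.0}

kept = filter_by_energy_window(example_energies, window=3.0)
print(kept)  # conformer 2 is dropped because it is far above the minimum
```

The window is the "threshold of choice" mentioned above; a tighter window means fewer conformers go into the cross-join.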
I then sort the results by 3D alignment score, but also give the end user the energy data so they can decide whether that alignment is useful or not.
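The cross-join and the sorting can be sketched in plain Python roughly as follows. This is only an illustration of the bookkeeping, not of the alignment itself: `score_fn` is a hypothetical stand-in for whatever computes the alignment score of one reference conformer against one target conformer, and the energies and scores are toy numbers. Whether a higher or lower score is better depends on the score you use, so that is left as a parameter.

```python
from itertools import product
from typing import Callable, Dict, List, NamedTuple


class Pairing(NamedTuple):
    ref_conf: int         # reference conformer id
    target_conf: int      # target conformer id
    score: float          # alignment score for this pair
    ref_energy: float     # energy of the reference conformer
    target_energy: float  # energy of the target conformer


def rank_pairings(
    ref_energies: Dict[int, float],
    target_energies: Dict[int, float],
    score_fn: Callable[[int, int], float],
    higher_is_better: bool = True,
) -> List[Pairing]:
    """Cross-join every reference conformer with every target conformer,
    score each pair, and return the pairs sorted best-first."""
    pairings = [
        Pairing(r, t, score_fn(r, t), ref_energies[r], target_energies[t])
        for r, t in product(ref_energies, target_energies)
    ]
    return sorted(pairings, key=lambda p: p.score, reverse=higher_is_better)


# Toy data: 2 reference conformers, 3 target conformers (all values invented).
ref_energies = {0: 10.0, 1: 11.5}
target_energies = {0: 20.0, 1: 20.8, 2: 23.1}
toy_scores = {
    (0, 0): 55.2, (0, 1): 61.0, (0, 2): 48.3,
    (1, 0): 70.4, (1, 1): 52.9, (1, 2): 66.1,
}

ranked = rank_pairings(ref_energies, target_energies, lambda r, t: toy_scores[(r, t)])
for p in ranked:
    print(p)
```

In this toy data the best-scoring pair uses reference conformer 1, which is not the lowest-energy reference conformer, so it would never be found if only the minimum-energy reference were aligned. The number of pairs is the product of the two conformer counts, so 50 conformers on each side already means 2500 alignments per molecule pair.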
Anyway, the main thing is the cross-join. If you take just 1 reference conformation, you're needlessly restricting the actual possibilities. Also note that other tools use other methods to align, and which one is preferred often depends on the end user.