I appreciate that both of these requests can already be met by combining several existing nodes, but I am thinking more about keeping operations simple for less advanced users.
1. Could there be a node that returns only the lowest-energy conformer per molecule in the table? Alternatively, could the existing optimisation node have an option to return only the lowest?
2. Could the existing optimisation node also have an additional output column, perhaps as an optional extra, that returns the energy of each conformer relative to the lowest?
Greg and I discussed your request and have come to the conclusion that additional options in either the Add Conformers or the Optimize Geometry node would be rather unexpected, and maybe even confusing, for a user: the Add Conformers node does not deal with energy right now, and the Optimize Geometry node, on the other hand, is independent of the Add Conformers node. We may consider a new node for this, but we both think it would be sufficient to explain somewhere how to build a Meta Node that accomplishes what you have in mind. If you already have something, maybe you want to share it with us and others as part of this thread. Otherwise, I can try to come up with a simple workflow for it. If it turns out to be really difficult, we may reconsider.
Thanks for the detailed reply. I agree trying to incorporate this into the existing nodes will be quite confusing.
Either a new node or a metanode that provides the lowest-energy conformer per molecule would be great. I have constructed a metanode that does this and also passes the user's original columns through to the output. If the metanode could be added to the RDKit node repository, that would be super.
Below is the metanode, with the settings used for each node:
RDKit Coordinates, default settings with 3D coordinates selected.
RDKit Add Conformers, in the input, the RDKit column is selected and RowID is used as the reference. Number of conformers is set to 20. The rest of the settings are kept as default.
RDKit Optimize Geometry, Remove Source Column is selected, RDKit Molecule (Conformers) is chosen as the Source column, and MMFF94 as the Force Field. New Column Name for the optimised molecules is RDKit Molecule (Optimised).
Sorter, sort on Reference Ascending, and also on Energy Ascending.
GroupBy, group on Reference. In Options choose RDKit Molecule (Conformers) and Reference, and choose "First" as the aggregation type for both. Keep original names for column naming.
Joiner, Join by RowID for left table and by Reference for right table. And use Inner Join method.
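For anyone who finds the table logic easier to follow outside KNIME, the Sorter, GroupBy and Joiner steps above can be sketched in plain Python. This is only an illustration of the idea, not the metanode itself: the row tuples, the `Name` column, the function names and the energy values are all made up for the example, and the conformer generation and optimisation steps are assumed to have been run already.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows as they might leave the optimisation step:
# (Reference, conformer label, Energy). All values are invented for illustration.
conformers = [
    ("Row0", "Row0_conf0", 42.5),
    ("Row0", "Row0_conf1", 38.0),
    ("Row1", "Row1_conf0", 15.25),
    ("Row1", "Row1_conf1", 19.75),
    ("Row0", "Row0_conf2", 40.0),
]

# Hypothetical original input table keyed by RowID, with one extra user column.
originals = {"Row0": {"Name": "mol A"}, "Row1": {"Name": "mol B"}}


def lowest_energy_per_molecule(rows):
    """Return {Reference: row} holding the lowest-energy row per molecule."""
    # Sorter: Reference ascending, then Energy ascending.
    ordered = sorted(rows, key=itemgetter(0, 2))
    # GroupBy with "First": after sorting, the first row of each group
    # is the lowest-energy conformer of that molecule.
    return {ref: next(group) for ref, group in groupby(ordered, key=itemgetter(0))}


def join_with_originals(lowest, table):
    """Inner join: RowID of the original table against Reference."""
    return [
        {"RowID": row_id, **cols,
         "Conformer": lowest[row_id][1], "Energy": lowest[row_id][2]}
        for row_id, cols in table.items()
        if row_id in lowest
    ]


lowest = lowest_energy_per_molecule(conformers)
result = join_with_originals(lowest, originals)
for row in result:
    print(row)
```

The important point is the ordering: taking "First" per group only gives the lowest-energy conformer because the table was sorted by Energy within each Reference beforehand.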
Hi Simon (and all others who are interested),
Thanks for your thoughts in your last post. Greg and I also tried creating some meta nodes and came up with two for each of your initial requests. Please find attached two demo workflows containing meta nodes that:
1. Determine the lowest-energy conformer per molecule in the table. One meta node takes input molecules; the other takes already optimized conformers with already calculated energy values.
2. Determine the energy of a conformer relative to the lowest energy of all conformers of a molecule. One meta node takes input molecules; the other takes already optimized conformers with already calculated energy values.
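As a rough illustration of the second item, the relative energy is simply each conformer's energy minus the lowest energy found for the same molecule. The sketch below is plain Python and not taken from the attached workflows; the function name, the row layout and the energy values are assumptions made for the example.

```python
# Hypothetical rows: (Reference, conformer label, Energy).
# All values are invented for illustration.
conformers = [
    ("Row0", "Row0_conf0", 42.5),
    ("Row0", "Row0_conf1", 38.0),
    ("Row0", "Row0_conf2", 40.0),
    ("Row1", "Row1_conf0", 15.25),
    ("Row1", "Row1_conf1", 19.75),
]


def add_relative_energy(rows):
    """Append each conformer's energy relative to its molecule's minimum."""
    # First pass: find the lowest energy per Reference.
    minima = {}
    for ref, _, energy in rows:
        minima[ref] = min(energy, minima.get(ref, energy))
    # Second pass: relative energy = energy - lowest energy of that molecule.
    return [(ref, conf, energy, energy - minima[ref]) for ref, conf, energy in rows]


with_relative = add_relative_energy(conformers)
for row in with_relative:
    print(row)
```

The lowest-energy conformer of each molecule ends up with a relative energy of 0, and all other conformers get a positive value in the same units as the Energy column.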
I am making them available here for use at your own risk. Companies that use the KNIME Server could upload these meta nodes as meta node templates for their users. I will not add them to the RDKit Nodes, because I am uncertain how to deal with future compatibility issues, which may arise when nodes used inside such a meta node change over time or whenever I myself update a meta node to a new version.
People who are in need of these nodes can copy them from a reference workflow or - as mentioned above - use the KNIME Server for sharing.
I attached two different versions because I discovered that the KNIME 2.8 workflow was not working properly in KNIME 2.9, and once saved with KNIME 2.9 it could not be imported back into KNIME 2.8. So, please pick the appropriate one for you.