Dear KNIME RDKit Users,
we have a few updates for you, which we have just made available in the nightly build and as official releases for the KNIME versions of the last two years.
What has changed?
Improvements and Updates:
- New RDKit Binaries 2023.09.2: Traditionally, we bring in the new binaries following the RDKit release cycle. You can find more information here. In our regression testing, we discovered some changes related to
  - AtomPair fingerprints
  - SVG generation
  - Optimize geometry, and
  - Substructure counting with stereochemistry involvement
  We are also aware of some bugfixes that affect the serialization and deserialization of RDKit molecules. This is important to know when you combine RDKit nodes in KNIME with Python scripts that also use RDKit features. Please make sure that the Python environment you use has the same RDKit binaries installed as KNIME to avoid errors.
- Full support of KNIME’s File Handling Framework: Each RDKit node that deals with input or output files as part of its settings has been refactored and now uses the KNIME capabilities to specify files from all kinds of file systems. If there is a KNIME Connector node for a file system (e.g. SSH Connector, S3 Connector, etc.), you should be able to specify files from it. We deprecated the old nodes, which means they will continue to work as before in existing workflows, but you can replace them yourself with the new nodes. The new nodes are used automatically whenever you drag and drop them from the node repository. The following nodes have been updated:
  - RDKit Fingerprint Reader
  - RDKit Fingerprint Writer
  - RDKit Structure Normalizer
  - RDKit Functional Group Filter
- Long-awaited Feature for RDKit One and Two Component Reaction Nodes: We finally added the capability to carry columns from the reactant input table(s) over to the product output table. In an additional configuration tab, you may pick any number of columns (e.g., just an ID column or a whole set of data) to be made available as part of the output.
- Vendor information: NIBR (Novartis Institutes for BioMedical Research) has been renamed, and we have updated the vendor name to Novartis. This also applies to all copyright notices used in the RDKit open-source code.
- For developers only - Dynamic Ports Support: The RDKit Nodes framework now supports dynamic ports in addition to the classic data ports. This makes it possible, for instance, to connect a file system coming from a KNIME Connector node.
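Regarding the note above on keeping the RDKit binaries in Python and KNIME in sync: a quick check inside the Python environment could look like the following minimal sketch. It assumes that comparing the version string reported by `rdkit.__version__` with the version named in this post (2023.09.2) is sufficient; the helper names `installed_rdkit_version` and `versions_match` are made up for illustration and are not part of RDKit or KNIME.

```python
from typing import Optional

# Version named in this announcement for the binaries shipped with the KNIME nodes.
KNIME_RDKIT_VERSION = "2023.09.2"


def installed_rdkit_version() -> Optional[str]:
    """Return the RDKit version of the current Python environment, or None if RDKit is missing."""
    try:
        import rdkit
    except ImportError:
        return None
    return rdkit.__version__


def versions_match(installed: Optional[str], expected: str = KNIME_RDKIT_VERSION) -> bool:
    """True only if a version was found and it is identical to the expected one."""
    return installed is not None and installed.strip() == expected.strip()


if __name__ == "__main__":
    found = installed_rdkit_version()
    if versions_match(found):
        print(f"RDKit {found} matches the KNIME binaries.")
    else:
        print(f"RDKit mismatch: Python has {found}, KNIME ships {KNIME_RDKIT_VERSION}.")
```

Running this once in the environment configured for your KNIME Python nodes should be enough to spot a mismatch before molecules are exchanged between the two sides.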
Bugfixes in our KNIME code:
- RDKit Rooted Fingerprints Had Wrong Results in KNIME: We discovered that rooted fingerprints (incl. count-based FPs) in KNIME differed from those computed in Python and pure Java. The cause was a bug in how the atom list for the rooting feature was created before it was passed to the RDKit native code. This has now been fixed. Rooted fingerprints will have slightly fewer on-bits than with the buggy version.
- For developers only - Bugfix locations: The bug mentioned above was fixed in the following methods. If you call them directly in your own code, please review your usage:
org.rdkit.knime.util.InputDataInfo.getRDKitIntegerVector(DataRow row)
InputDataInfo.getRDKitUIntegerVector(DataRow row)
org.rdkit.knime.nodes.rdkitfingerprint.AbstractRDKitFingerprintNodeModel.createOutputFactories(final int outPort, final DataTableSpec inSpec)$AbstractRDKitCellFactory.process(final InputDataInfo[] arrInputDataInfo, final DataRow row, final long lUniqueWaveId)
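If you would like to compare the fixed rooted fingerprints against Python, the following is a minimal sketch of how a rooted fingerprint can be computed there. It assumes a Morgan fingerprint created with RDKit's `rdFingerprintGenerator` module; the molecule, radius, bit size and root atom are illustrative only and need to be matched to your KNIME node settings. The helper name `on_bit_counts` is made up for this example, and it returns `None` when RDKit is not installed.

```python
def on_bit_counts(smiles, root_atoms, radius=2, n_bits=2048):
    """Return (full, rooted) on-bit counts of a Morgan fingerprint, or None without RDKit."""
    try:
        from rdkit import Chem
        from rdkit.Chem import rdFingerprintGenerator
    except ImportError:
        return None

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")

    gen = rdFingerprintGenerator.GetMorganGenerator(radius=radius, fpSize=n_bits)
    # Unrooted: environments around every atom contribute bits.
    full = gen.GetFingerprint(mol)
    # Rooted: only environments centered on the listed atom indices contribute.
    rooted = gen.GetFingerprint(mol, fromAtoms=list(root_atoms))
    return full.GetNumOnBits(), rooted.GetNumOnBits()


if __name__ == "__main__":
    # Ethoxybenzene, rooted at atom index 0 (the methyl carbon).
    print(on_bit_counts("CCOc1ccccc1", root_atoms=[0]))
```

With the fix in place, the on-bits of a rooted fingerprint from the KNIME node should agree with the Python result for the same settings and atom list.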
The update is available
- for KNIME 4.6 as RDKit Feature 4.8.1, and
- for KNIME 4.7, 5.1 and 5.2 as RDKit Feature 4.9.1.
Thanks to Roman Balabanov (EPAM working for Novartis), @ptosco, @greglandrum and the rest of the RDKit community for their contributions! We wish all of you a Merry Christmas and a Happy New Year!
Kind regards,
Manuel Schwarze