RDF file reader/fixer/converter (replacement for or add-on to the Erlwood Reactions File Reader)

For those who deal with chemical reaction files and need a reader node: the Erlwood extension offers a suitable one, though with two drawbacks that are major for me.
I addressed both of them with a Python script:

  • The Erlwood node does not exist (yet) for KNIME 4.3.x.
  • It fails on RDF files with missing structures.
    This can happen, for example, with RDF exports from SciFinder or Reaxys.

See here on the KnimeHub:

This mini-workflow reads any number of RDF files, checks each one for missing structures and fixes the file by simply eliminating the affected records. In addition, a CSV file is created with the structures converted to SMILES.
The important part is contained in the single Python node.
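To illustrate the "check and eliminate" step, here is a minimal sketch using only the Python standard library. It is not the script from the workflow. It assumes V2000-style RDF files in which each record starts with a `$RFMT` line and the reactant/product count line sits four lines after `$RXN`, and it treats a record as incomplete when the number of `$MOL` blocks does not match that count line. The function names are made up for this example.

```python
def split_rdf_records(text):
    """Split RDF text into (header_lines, records).

    Each record is a list of lines beginning with its $RFMT line.
    """
    header, records = [], []
    for line in text.splitlines():
        if line.startswith("$RFMT"):
            records.append([line])
        elif records:
            records[-1].append(line)
        else:
            header.append(line)  # e.g. $RDFILE and $DATM lines
    return header, records


def has_all_structures(record):
    """Return True if the $MOL block count matches the declared count line.

    Assumption: V2000 layout, i.e. three header lines after $RXN, then a
    count line with two 3-character fields (reactants, products).
    """
    try:
        rxn = next(i for i, l in enumerate(record) if l.startswith("$RXN"))
        counts = record[rxn + 4]
        declared = int(counts[0:3]) + int(counts[3:6])
    except (StopIteration, IndexError, ValueError):
        return False  # no $RXN block or unreadable count line
    found = sum(1 for l in record if l.startswith("$MOL"))
    return declared > 0 and found == declared


def drop_incomplete(text):
    """Return (cleaned_text, number_of_dropped_records)."""
    header, records = split_rdf_records(text)
    kept = [r for r in records if has_all_structures(r)]
    lines = header + [l for r in kept for l in r]
    return "\n".join(lines) + "\n", len(records) - len(kept)
```

Calling `drop_incomplete(open(path).read())` on each input file would give the cleaned text plus a count of removed records, which is handy for logging how much was lost.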

If you have worked with Reaxys or SciFinder RDF imports, you will know that the number of resulting columns differs from file to file. The same goes for the resulting CSV files, including the number of structure columns.

The Python script uses as few imports as possible, though it does require RDKit in your Python installation. It should work with any KNIME 4.x version; it has not been tested with earlier 3.x versions.

There is certainly room for improvement in the parsing and the output, but it already goes a long way.
I hope it helps someone else as well.


Hello @docminus2,

Thanks for sharing this with the KNIME community. Definitely very helpful!


I have updated the script (and the description).
The major change is the inclusion of minimal sanitization of the molecules; without it, the script crashes when a faulty molecule is converted to SMILES.
(I would update my original post, but it seems it isn’t possible anymore)
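
For anyone curious what such a sanitization guard can look like, here is a rough sketch with RDKit. It is not the code from the workflow, and the helper names are invented for this example. The idea is to parse without sanitizing, sanitize explicitly, and return `None` instead of crashing when a molecule is faulty.

```python
from rdkit import Chem
from rdkit import RDLogger

# Silence RDKit's parser and sanitization messages for faulty input.
RDLogger.DisableLog("rdApp.*")


def sanitized_smiles(mol):
    """Return a canonical SMILES string, or None if the molecule is faulty.

    Hypothetical helper: sanitization errors (e.g. impossible valences)
    are caught so that one bad structure does not stop the whole run.
    """
    if mol is None:
        return None
    try:
        Chem.SanitizeMol(mol)
    except Exception:
        return None
    return Chem.MolToSmiles(mol)


def molblock_to_smiles(molblock):
    """Convert a single mol block (the text after a $MOL line) to SMILES."""
    mol = Chem.MolFromMolBlock(molblock, sanitize=False)
    return sanitized_smiles(mol)
```

Writing an empty cell for every `None` keeps the CSV rows aligned with the reaction records even when individual structures cannot be converted.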