RDF file reader/fixer/converter (replacement for or add-on to the Erlwood Reactions File Reader)

For those who deal with chemical reaction files and need a reader node: the Erlwood extensions offer a suitable one, though with at least two drawbacks that are major for me.
I addressed both of the following with a Python script:

  • The Erlwood node doesn’t exist (yet) for KNIME 4.3.x.
  • It fails if you have RDF files with missing structures.
    The latter can happen, e.g., with RDF exports from SciFinder or Reaxys.

See here on the KnimeHub:

This mini-workflow reads any number of RDF files, checks each one for records with missing portions, and fixes the file by simply eliminating those records. In addition, a CSV file is created with the structures converted to SMILES.
The important part is the single Python node.
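
To give an idea of what the "check and fix" step can look like, here is a minimal sketch in plain Python. It is not the script from the workflow. It assumes that each record starts with a `$RFMT` line and that a record counts as broken when the number of `$MOL` blocks does not match the reactant/product counts declared in the `$RXN` header (V2000 layout). The function names are made up for illustration.

```python
import re

# Assumption: every reaction record starts with a line beginning "$RFMT".
RECORD_START = re.compile(r"^\$RFMT", re.MULTILINE)


def split_records(rdf_text):
    """Split RDF text into (file header, list of record strings)."""
    starts = [m.start() for m in RECORD_START.finditer(rdf_text)]
    if not starts:
        return rdf_text, []
    bounds = starts + [len(rdf_text)]
    records = [rdf_text[a:b] for a, b in zip(bounds, bounds[1:])]
    return rdf_text[:starts[0]], records


def is_complete(record):
    """True if the record holds as many $MOL blocks as its $RXN header declares."""
    lines = [line.rstrip() for line in record.splitlines()]
    if "$RXN" not in lines:
        return False
    rxn_index = lines.index("$RXN")
    # V2000 layout: $RXN, name, program/date, comment, then the counts line.
    if rxn_index + 4 >= len(lines):
        return False
    counts = lines[rxn_index + 4]
    try:
        declared = int(counts[0:3]) + int(counts[3:6])
    except ValueError:
        return False
    found = sum(1 for line in lines[rxn_index:] if line == "$MOL")
    return declared > 0 and declared == found


def fix_rdf(rdf_text):
    """Drop incomplete records; return (fixed text, number of dropped records)."""
    header, records = split_records(rdf_text)
    kept = [record for record in records if is_complete(record)]
    return header + "".join(kept), len(records) - len(kept)
```

Dropping whole records keeps the remaining file valid for any downstream reader, at the cost of losing the data fields of the broken entries.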

If you have worked with Reaxys or SciFinder RDF imports, you will know that the number of resulting columns differs from file to file. The same goes for the resulting CSV files, even for the number of structure columns.
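
One way to cope with the varying columns when writing the CSV is to collect the union of all column names first and leave missing cells empty. This is only a sketch with the standard `csv` module; the column names in the usage lines (`reactant_1` and so on) are hypothetical and not taken from the workflow.

```python
import csv
import io


def write_rows(rows, handle):
    """Write dict rows with differing keys to one CSV; return the header used."""
    # Union of all keys, in the order they are first seen.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    # restval fills the cells of columns a row does not have.
    writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames


# Usage with two records that have a different number of structure columns.
rows = [
    {"reactant_1": "CCO", "product_1": "CC=O"},
    {"reactant_1": "C", "reactant_2": "O", "product_1": "CO"},
]
buffer = io.StringIO()
header = write_rows(rows, buffer)
```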

The Python script uses as few imports as possible, though it does require RDKit in your Python installation. It does not depend on a particular KNIME 4.x version. It has not been tested with earlier 3.x versions.
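
For the SMILES part, a conversion of a single structure block with RDKit could look like the sketch below. It uses `Chem.MolFromMolBlock` and `Chem.MolToSmiles`; whether the workflow's script converts per molecule like this or per reaction is not something I can say from the description, and the function name is made up.

```python
def molblock_to_smiles(mol_block):
    """Return the canonical SMILES for a molfile block, or None if unusable."""
    # Imported here so the rest of a script still loads without RDKit.
    from rdkit import Chem

    if not mol_block.strip():
        return None  # empty block, i.e. a missing structure
    mol = Chem.MolFromMolBlock(mol_block)
    # RDKit returns None when the block cannot be parsed.
    return Chem.MolToSmiles(mol) if mol is not None else None
```

Returning `None` instead of raising lets the caller decide whether to write an empty cell or skip the record.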

There is certainly room for improvement in the parsing and the output, but it goes a long way.
I hope it helps someone else as well.

Hello @docminus2,

Thanks for sharing this with the KNIME community. Definitely very helpful!

Cheers,
Janina