How can I extract molecular structures with KNIME to prepare datasets correctly?

I would like to retrieve the InChI of molecules from various public databases, starting from their CAS registry numbers. One special requirement: the lookup must be enantiomer-specific (stereochemistry must be preserved) so that my datasets are prepared correctly.

I found that ChEMBL meets the enantiomer-specific requirement, but the ChEMBL extractor nodes do not allow searching by CAS registry number. I tried to retrieve the CHEMBL_ID from various public databases using the CAS registry number, the "HtmlParser" node and XML extraction, but this does not work over thousands of iterations (the connection is lost, or the requests may be blocked by the providers). After that, I tried the ChEBI nodes, but this database does not contain all the molecules I need (200 out of 700 required molecules).

Do you have any solution to my problem, or any suggestion to refine my approach?

Is there any particular reason that you need to start from CAS numbers? They are difficult identifiers to use, since the only authoritative (in fact, almost the only allowed) source of them is CAS itself, and you have to pay for that access.


You can use the CIR (Chemical Identifier Resolver) node, available in the community contributions. You can download it directly from KNIME.
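
If you want to check outside KNIME what the resolver returns for a few CAS numbers, here is a minimal Python sketch. It assumes the public CIR URL scheme (https://cactus.nci.nih.gov/chemical/structure/<identifier>/stdinchi); the function names, the retry count and the delay are my own choices for illustration and have nothing to do with the KNIME node. It validates the CAS check digit first, pauses between retries so that thousands of lookups are less likely to be dropped, and flags whether the returned InChI carries a tetrahedral stereo layer (/t). Note that CIR is not an authoritative source for CAS numbers, so a result without a stereo layer should be checked by hand.

```python
import re
import time
import urllib.error
import urllib.parse
import urllib.request

# Assumed base URL of the public CIR service.
CIR_BASE = "https://cactus.nci.nih.gov/chemical/structure"


def is_valid_cas(cas: str) -> bool:
    """Check the format and the check digit of a CAS registry number."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weights 1, 2, 3, ... are applied from the rightmost digit leftwards.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def cir_url(cas: str, representation: str = "stdinchi") -> str:
    """Build the lookup URL for one identifier."""
    return f"{CIR_BASE}/{urllib.parse.quote(cas.strip())}/{representation}"


def has_stereo_layer(inchi: str) -> bool:
    """True if the InChI contains a tetrahedral stereo layer (/t...)."""
    # Skip the version prefix and the formula layer.
    return any(layer.startswith("t") for layer in inchi.split("/")[2:])


def fetch_inchi(cas: str, retries: int = 3, delay: float = 1.0):
    """Return the response text for a CAS number, or None if not resolved."""
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(cir_url(cas), timeout=30) as resp:
                return resp.read().decode("utf-8").strip()
        except urllib.error.HTTPError as err:
            # Assumption: an unknown identifier is answered with HTTP 404.
            if err.code == 404:
                return None
        except urllib.error.URLError:
            pass  # connection problem: wait and try again
        time.sleep(delay * (attempt + 1))
    return None


if __name__ == "__main__":
    for cas in ["50-00-0", "7732-18-5"]:
        if not is_valid_cas(cas):
            print(cas, "invalid CAS number")
            continue
        inchi = fetch_inchi(cas)
        print(cas, inchi, has_stereo_layer(inchi) if inchi else None)
        time.sleep(1.0)  # be polite to the public service
```

Running the lookups sequentially with a pause is deliberate: it is slower, but it avoids hammering a free service with 700 requests at once.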

Here is the CIR web page on KNIME:

