This workflow is based on the abstract KNIME Model Factory and is adapted to a life sciences use case. Please refer to the blog post for further details. The training data for the models comes from ChEMBLdb (https://www.ebi.ac.uk/chembl/beta/) and is publicly accessible. The workflow uses a sample of the data for the test run. To run the workflow on the full set, remove the Row Sampling node.

To run this workflow on a system with distributed executors, set the number of chunks in the Parallel Chunk Loop Start node according to the resources of your cluster. Don't forget to leave a few executors for the management of the resources. For example, for a cluster of 10 executors with 2 cores x 2 processors each (40 cores in total), a good number of chunks would be 36.
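The chunk count in the example above can be reproduced with a little arithmetic. The following is only a rough sketch in Python, not part of the workflow itself. The function name `suggested_chunk_count` and the `reserved_cores` parameter are made up for illustration, and the reserve of 4 is an assumption inferred from the numbers in the post (40 cores in total, 36 chunks).

```python
def suggested_chunk_count(executors, cores_per_processor,
                          processors_per_executor, reserved_cores):
    """Return a chunk count that leaves some cores free for resource management.

    Hypothetical helper: one chunk per core, minus a reserve.
    The size of the reserve is a judgment call, not a KNIME setting.
    """
    total_cores = executors * cores_per_processor * processors_per_executor
    if reserved_cores < 0 or reserved_cores >= total_cores:
        raise ValueError("reserved_cores must be between 0 and total_cores - 1")
    return total_cores - reserved_cores


# Cluster from the post: 10 executors, 2 cores x 2 processors each.
# Reserving 4 cores is an assumption that matches the suggested 36 chunks.
print(suggested_chunk_count(10, 2, 2, reserved_cores=4))
```

The resulting number would then be entered as the custom chunk count in the Parallel Chunk Loop Start node. How many cores to hold back depends on your cluster, so treat the reserve as a starting point to tune.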
This is a companion discussion topic for the original entry at https://kni.me/w/duhMwpUAQ2wEOZJn