Hi,
I have tried to replicate the KNIME + RDKit + Python workflow to generate some QED values, only to end up down a rabbit hole when I tried to install RDKit in a conda environment first:
The install (conda, RDKit, QED) looked OK, but when I tried the script in the KNIME Python node the RDKit Chem function was not recognised.
So back I went into Jupyter and tried there (correct env also selected) and got the same issue.
As a last resort I also tried running the Python line directly in the conda env and got the same issue, which points in the direction of RDKit missing some dependencies…
So far I have tried several fixes that litter the web to no avail (conda-forge, checking with ‘pip check’ for missing dependencies etc.)
Bottom line: is there an up-to-date, clean and easy way to install RDKit in a conda env on Windows 11?
Happy to install older versions of conda/KNIME/Python to get it to work…
Thanks
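A quick way to narrow this kind of problem down is to ask the interpreter itself which environment it is running from and whether it can find the packages at all. The following is only a minimal sketch using the standard library; the helper name `check_modules` is made up for illustration, and the module list is an assumption based on the packages mentioned above.

```python
import importlib.util
import sys


def check_modules(names):
    """Return {module_name: True/False} for whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # sys.executable shows which interpreter (and therefore which conda env) is running.
    print("Interpreter:", sys.executable)
    # Assumed package list; adjust to whatever the script actually imports.
    for name, found in check_modules(["rdkit", "pandas"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If the interpreter path is not inside the expected conda env, or `rdkit` shows as missing, the problem is likely the environment selection rather than the script. The same lines can be pasted into Jupyter or a KNIME Python node to compare.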
Hi,
first of all Happy New Year and thank you for the prompt reply.
I should mention I am a medicinal chemist by trade with a keen interest in CADD, and for the past year I have been using KNIME for scientific data mining.
The workflow I put together is a mere copy of the example outlined in a YT video by your colleagues combining KNIME, RDKit and Python (where they run RDKit and QED functions in a Python script node).
While I have been learning and using KNIME for all of my med chem needs, I wanted to learn how to use the plethora of GitHub Python scripts in KNIME; while I appreciate this may not be as easy as working in a bona fide Python environment, I see greater value in integrating the two.
I would like to emphasise KNIME is working fine as always (I have 2 discrete versions, 4.5 and 4.7, to run specific nodes that sadly have not been updated over the years…).
I genuinely believe the issues lie with my Python and conda builds that ‘dislike’ RDKit…
Will follow the guide you kindly suggested and report back!
Best,
Oz
Many thanks Evert!
Like you, I have had issues first creating the correct environment and then forcing KNIME to use it.
The YAML environment works as intended: that’s half the battle.
I am trying to replicate your code by adapting it to my env build (pandas, RDKit and QED, the latter from uamc-qed via a GitHub install).
Instead of a SMILES table I am using an SDF Dotmatics export with cpd ID, SMILES, activity etc.
Could you point me in the right direction?
As you may have guessed I am new to python and I have much to learn…
Attached is a simple workflow that calls the TDC conda environment named pytdc and uses SMILES as input to calculate QED and SAscore. Hopefully this will get you started. If you have a file containing SMILES, you should be able to read it with the regular File Reader and use the SMILES column (as a regular string) as input instead. In the Python script this is the column referenced in
list = df['Smiles'].tolist()
Note that under the KNIME Python preferences I am using Conda configuration, not the bundled version. I have miniconda3 installed and there I created the pytdc environment as described in the earlier thread.
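For anyone adapting this to their own data, here is a minimal sketch of scoring a list of SMILES strings. Note the assumptions: it uses RDKit's own `QED.qed` rather than the TDC oracle from the attached workflow, the helper names `score_smiles` and `rdkit_qed` are made up for illustration, and the column name `Smiles` is taken from the line above.

```python
def score_smiles(smiles_list, scorer):
    """Apply scorer to each SMILES string; store None where the scorer raises ValueError."""
    scores = []
    for smi in smiles_list:
        try:
            scores.append(scorer(smi))
        except ValueError:
            # Unparsable structure: keep the row but leave the score empty.
            scores.append(None)
    return scores


def rdkit_qed(smiles):
    """QED via RDKit (assumed installed); imported lazily so a missing install fails loudly here."""
    from rdkit import Chem
    from rdkit.Chem import QED

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return QED.qed(mol)


# Hypothetical use inside the script, with df being the input table as a pandas DataFrame:
# df["QED"] = score_smiles(df["Smiles"].tolist(), rdkit_qed)
```

Only `ValueError` is caught on purpose, so that an `ImportError` from a broken RDKit install is not silently turned into a column of empty scores.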
Many thanks Evert for taking the time to help and for sharing the workflow.
All working fine with my own data, well happy!
I am very grateful for all the help provided by you and the knime community.
Best,
Oz
PS
In case someone else runs into similar issues, I have added some notes below outlining the steps I had to follow to get it to work (miniconda3/conda-forge build):
i) uninstall the conda-forge pytdc ($ conda remove pytdc), allowing up to 20 min for the task to complete
ii) install the pip version ($ pip install pytdc, then $ pip install pytdc --upgrade, though I believe it can be done in one line if it follows the Linux syntax)
iii) install KNIME 5.2 as a standalone version (zip download into a specific folder of choice; I have 3 different versions and all are working just fine, and I am only doing this to run some legacy med chem nodes)
iv) install networkx ($ pip install networkx)
v) select my own conda-forge env from the drop-down menu in the Conda configuration node
vi) execute away!