Python Script node converting SMILES to RDKit molecules freezes at 70%

Hi,
I have a problem with a simple Python Script node that converts SMILES to RDKit molecules. The node freezes at 70% (executing the same script inside the node editor works). I have read several related posts but have not found a solution yet (e.g. increasing the memory in the ini file …). It seems to be related to the "rdkit.Chem.rdchem.Mol" object: when I convert the molecules back to SMILES, the Python Script node executes successfully.
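
A minimal sketch of the round trip described above, assuming RDKit is importable in the environment the node uses. The function name to_canonical_smiles is made up for illustration; inside the node, the list of SMILES would come from the relevant column of the input table. It returns plain strings rather than rdkit.Chem.rdchem.Mol objects, which is the variant reported to execute successfully.

```python
# Sketch of the SMILES -> Mol -> SMILES round trip.
# Assumption: RDKit is installed in the active conda environment.
try:
    from rdkit import Chem
except ImportError:  # RDKit is missing from the active environment
    Chem = None


def to_canonical_smiles(smiles_list):
    """Parse each SMILES with RDKit and return canonical SMILES strings.

    Entries that RDKit cannot parse become None.
    """
    if Chem is None:
        raise RuntimeError("RDKit is not importable in this environment")
    result = []
    for smiles in smiles_list:
        mol = Chem.MolFromSmiles(smiles)  # returns None for invalid input
        result.append(Chem.MolToSmiles(mol) if mol is not None else None)
    return result
```

If the RuntimeError is raised, the node is not running in the environment that was expected, which is worth ruling out before looking at the Mol objects themselves.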

System:

  • VirtualBox Ubuntu guest (Windows 10 host); also tested with CentOS 7 and CentOS 8 guests
  • KNIME 4.3.4 (also tested on KNIME 4.4.0)
  • Conda envs
    • RDKit (standard installation: conda create -c conda-forge -n my-rdkit-env rdkit)

It would be great if someone could help me out.

Best

Hi Anjo,

Would it be possible to post here the workflow you are displaying (even without data)? It would help us to check and maybe to find why it is not working as expected.

Best

Ael

Hi Ael,

thanks for your reply and for trying to find a solution. I will attach the workflow. By the way, it executes on the Windows 10 host, so it may have something to do with VirtualBox.

Python_RDKit_Freeze.knwf (16.3 KB)

You may have to adjust the path in the configuration of the Conda_env component with the path to your RDKit conda environment.

Best

A

Hi @Anjo

My pleasure. It seems to work on my side. I have created a new RDKit environment from scratch and I’m calling it using the -Conda Environment Propagation- node. It may help you to reproduce this RDKit environment and compare it to yours.

Here is your updated workflow:

Python_RDKit_Freeze_with_Conda_Environment_Propagation.knwf (42.5 KB)

Hope it helps :wink:

best

Ael

Hi Ael,

many thanks for the workflow. I could only test it quickly yesterday, on the Ubuntu VirtualBox guest and on the Windows host. On both, the Conda Environment Propagation node failed to re-create the conda environment. I haven’t used this node before, but that is one of its purposes, right? I will see if I can re-create the environment in a different way.

Another question: on which system did you run your workflow? That would be of interest, as I think the problem may lie within VirtualBox (sorry, not an expert here… but on every other system that is not a VirtualBox guest, the conda env created from a yml worked).

Best

A.

Hi @Anjo,

The workflow (Python Scripts) works fine in itself (as shown by my test), but there is definitely a problem with the setup of your Conda environment.

Setting an RDKit Conda environment is quite easy. I would say there are at least two ways:

  1. Go to Preferences → KNIME → Community Scripting → Python, and create a new environment with a name of your choice (but obviously different from those you have already created). I gave it the name “py3_knime_RDKit” here (but you may need to give it a different one).

Once this environment is created, open an Anaconda terminal on Windows and execute the following commands (the part after the prompt):
(base) PS C:\Users\aworker> conda activate py3_knime_RDKit
(py3_knime_RDKit) PS C:\Users\aworker> conda install -c conda-forge rdkit

That is all. I would recommend trying this way and posting here any error messages you get when executing conda install -c conda-forge rdkit. It sounds like there are conflicts between some of the Python libraries already installed in your Conda environment and the ones RDKit wants to install.

  2. The other alternative is the one you used initially, as explained on the following RDKit web page:
    Installation — The RDKit 2021.09.1 documentation

I personally prefer the first one for use with KNIME, but I would say they should be equivalent.

The -Conda Environment Propagation- node is trying to automatically reproduce what is achieved in option (1).

I’m working in a Windows 10 environment, so maybe, as you say, the problem is related to the Conda setup of your VirtualBox Ubuntu guest on a Windows 10 machine.
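
To compare the Windows and VirtualBox setups, it may help to run the same small check in each environment. The following is a rough sketch using only the standard library; the function name describe_environment and the default package list are illustrative assumptions, not something KNIME provides.

```python
import importlib.util
import platform
import sys


def describe_environment(packages=("rdkit", "pandas", "numpy")):
    """Collect basic facts about the running Python environment."""
    info = {
        "executable": sys.executable,  # which interpreter is actually running
        "python_version": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        # find_spec returns None when a top-level package cannot be located
        info[name] = importlib.util.find_spec(name) is not None
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running it once from a terminal with the environment activated and once from inside a Python Script node should show whether both use the same interpreter and whether rdkit is visible to each.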

Hope it helps.

Best

Ael

Hi @Anjo

Just a comment about my previous post. I wrote it based on a Windows installation, but obviously everything I said should also be valid and executable in a Linux terminal.

Best

Ael

Hi Ael,

many thanks for your suggestions. I followed your instructions (first option: create a “New environment” within KNIME…). The environment was successfully created and the “Conda Environment Propagation” node executes successfully. The Python Script node now fails with an error (see below). I haven’t had the time to “debug” yet but will post an update as soon as possible.

I am still a bit baffled, because the conda environment I used was tested and worked on different Linux systems/distributions and on Windows (together with the respective KNIME workflows). So transferring this environment to different systems and distributions worked in most cases. But when I (successfully) re-create the environment on the VirtualBox Linux guest (and only there, using the same distributions and KNIME versions), the same KNIME workflow fails. I know… the nature of things… just a bit frustrating :slight_smile:

I will give an update.

Again, thanks a lot for your input. I think with the error message I can now move forward.

Best

A

Hi Anjo,

Glad it helped, and thanks for your feedback. Looking forward to your news; it is my pleasure to help.

Best wishes,

Ael

@Anjo I think the point is that, depending on its settings, Conda Environment Propagation tries to reproduce the exact packages and versions of the Python modules, which tend to differ between operating systems. I am not sure whether it only checks the names (in theory that should cover it).

What I would do is select a good basic conda configuration for KNIME (a YAML file) suited to your operating system and to the basic things you want to do (in particular, whether or not you need KNIME-based Deep Learning, which requires specific settings). You will have to settle on one Python version (3.6, 3.7, 3.8 …), choosing the best one compatible with your specific package, in this case RDKit.

Then you should probably limit the channels to “conda-forge”, plus perhaps some additional pip installations.

If your special package does not install immediately via the YAML file, that is fine; just add it later. Once you have done that on your specific operating system, you can store the whole thing in a Conda Environment Propagation node and deploy it to other people.
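
As a rough illustration of such a configuration, the sketch below generates the text of a conda environment file restricted to the conda-forge channel. The helper name build_environment_yaml, the environment name and the Python version are placeholders chosen for this example, and the packages the KNIME Python integration itself needs are not listed and would have to be added.

```python
def build_environment_yaml(name, python_version, conda_packages, pip_packages=()):
    """Return the text of a conda environment file restricted to conda-forge."""
    lines = [
        f"name: {name}",
        "channels:",
        "  - conda-forge",
        "dependencies:",
        f"  - python={python_version}",
    ]
    lines += [f"  - {pkg}" for pkg in conda_packages]
    if pip_packages:
        # pip itself is a conda dependency; pip-installed packages go in a sub-list
        lines.append("  - pip")
        lines.append("  - pip:")
        lines += [f"    - {pkg}" for pkg in pip_packages]
    return "\n".join(lines) + "\n"


# Hypothetical values: the environment name and Python version are placeholders.
yaml_text = build_environment_yaml("py3_knime_rdkit", "3.8", ["rdkit"])
print(yaml_text)
```

Saved to a file such as environment.yml, the result could be used with conda env create -f environment.yml, and packages that fail to resolve can be left out and installed afterwards.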

You might get an idea from this example:

We all look forward to the deeper integration of Anaconda/Python and KNIME that has been announced.

Hi @mlauber71,

please excuse my late reply. Yes, I agree; I think I have to go back to my conda environment and build it step by step. I had actually already confined the conda channels to conda-forge. I haven’t had the time to work on it further yet, but I think this is the way to go.

Many thanks.

Best

A
