Hello,
I used a Java Snippet node with an RDKit script.
One of the users here, @josemanuel, managed to make it work with the Linux version (4.02), but on my Windows 10 installation it does not work. Once the snippet starts executing, KNIME crashes and closes before I can read any error.
The snippet worked normally for me with other PDB files but crashes with the one attached here.
My analysis of the problem: it worked with other PDB files, so the script itself and RDKit should be all right.
The PDB file did not work for me but worked for a Linux user, so I guess it is a problem in the Windows version.
I attach the workflow and the pdb causing the problem.
Thanks,
rdkit_from_pdb_2.zip (68.8 KB)
I am able to reproduce this on my Windows machine. I will see if I can figure out what's going on.
Hi @greglandrum,
I also rewrote the code in Python, and this time I get a different error.
Below is the script that I used in the Python Script node, with the error just underneath.
from rdkit import Chem
from rdkit import *
import pandas as pd
c1_PDBFiles = input_table["PDB Files"]
c_PDBFiles = c1_PDBFiles.to_string()
# fileName: name of the file to read
# sanitize: (optional) toggles sanitization of the molecule. Defaults to True.
# removeHs: (optional) toggles removing hydrogens from the molecule. This only makes sense when sanitization is done. Defaults to True.
# flavor: (optional)
# proximityBonding: (optional) toggles automatic proximity bonding
out_RDKit = Chem.rdmolfiles.MolFromPDBFile(c_PDBFiles, True, True, 0, True)
Bad input file PF00012_1BUP_6_381_A_cropped_-preprocessed.pdb REMARK 4 COMPLIES WITH FORMAT V. 3.0, 1…
Traceback (most recent call last):
File "", line 17, in
OSError: Bad input file PF00012_1BUP_6_381_A_cropped-_preprocessed.pdb REMARK 4 COMPLIES WITH FORMAT V. 3.0, 1…
PF00012_1ATR_6_384_A_cropped.zip (59.7 KB)
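A note on the error above: the string passed to `MolFromPDBFile` appears to be the row ID followed by the PDB text itself, not a file path, which would explain the "Bad input file" message. If the "PDB Files" column really holds the file contents (an assumption based only on that message), one option is to write the text to a temporary file and pass that path to the reader. Below is a minimal sketch using only the standard library; the helper name `pdb_text_as_file` is made up for illustration.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def pdb_text_as_file(pdb_text):
    """Yield the path of a temporary .pdb file holding pdb_text.

    Hypothetical helper: the file is removed again when the block exits.
    """
    # delete=False so the file can be reopened by name after closing,
    # which matters on Windows.
    handle = tempfile.NamedTemporaryFile("w", suffix=".pdb", delete=False)
    try:
        handle.write(pdb_text)
        handle.close()
        yield handle.name
    finally:
        handle.close()  # harmless if the file is already closed
        os.remove(handle.name)
```

Inside the `with` block the yielded path could then be handed to `Chem.MolFromPDBFile`. RDKit also has `Chem.MolFromPDBBlock`, which parses PDB text directly and would avoid the temporary file altogether.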
I've managed to at least narrow the problem down a bit more. The problem is actually happening when the RDKit tries to generate SMILES for the molecule (which happens with every RDKit molecule cell) and it seems to be related to the size of the protein.
I am going to continue to investigate, but this may not be something that's immediately fixable.
@greglandrum,
I managed to run the Python script successfully inside the node's editor, but KNIME still crashes when I execute the node in the workflow (the problem may also be related to saving the output in SMILES format, as you mentioned).
I put below the corrected script:
output_table = input_table.copy()
from rdkit import Chem
from rdkit import *
import pandas as pd
# fileName: name of the file to read
# sanitize: (optional) toggles sanitization of the molecule. Defaults to True.
# removeHs: (optional) toggles removing hydrogens from the molecule. This only makes sense when sanitization is done. Defaults to True.
# flavor: (optional)
# proximityBonding: (optional) toggles automatic proximity bonding
filename = list(dict(input_table["Location"]).values())[0].replace("file:/", "")
print(filename)
out_RDKit = Chem.rdmolfiles.MolFromPDBFile(filename, True, True, 0, True)
output_table["out_RDKit"] = out_RDKit
print(output_table["out_RDKit"])
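A side note on the `.replace("file:/", "")` step in the script above: it works on Windows, where it leaves something like `C:/...`, but on Linux it would also strip the leading slash of an absolute path, and it does not decode escapes such as `%20`. A possible alternative, sketched with the standard library only (the function name `file_url_to_path` is hypothetical), is:

```python
from urllib.parse import urlparse
from urllib.request import url2pathname


def file_url_to_path(location):
    """Convert a 'file:' URL to a local path; return other strings unchanged."""
    parsed = urlparse(location)
    if parsed.scheme != "file":
        # Not a file URL (for example, already a plain path): leave it alone.
        return location
    # url2pathname decodes %-escapes and applies the platform's path rules.
    return url2pathname(parsed.path)
```

The result could be used in place of the `filename` value before calling `MolFromPDBFile`. This only changes how the path is derived; it is not expected to affect the crash itself.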
Yes, it looks like you will have the same problem no matter how you try to construct a molecule cell using this molecule.
There's some kind of tricky problem in the core of the code (it's not connected to KNIME in any way) that only happens on Windows.
I'm not sure that I'm going to be able to fix this, but I will put a bit more time into it.
@greglandrum,
I need the RDKit Mol format to calculate the molecular descriptors.
Is there another format that I can use instead of RDKit Mol?
Otherwise, could I calculate the descriptors directly inside the Python script, without saving the format that causes the problem?
When you say it is a tricky problem, do you mean it is a bug, or a limit on the number of residues in a peptide chain? I noticed that KNIME crashed more often with quite large molecules and worked with the others.
If I understand the nature of this RDKit problem, I can filter my proteins in a way that discards the entries that cause the problem.
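For the filtering idea, a rough pre-filter could count atoms and residues straight from the PDB text before anything is handed to RDKit. The sketch below uses only the standard library and the fixed column layout of PDB `ATOM` records. The function names and the `max_residues` cutoff are hypothetical: the thread does not establish any size at which the crash starts, so a threshold would have to be found by trial.

```python
def count_atoms_and_residues(pdb_text):
    """Return (atom_count, residue_count) from the ATOM records of PDB text."""
    atoms = 0
    residues = set()
    for line in pdb_text.splitlines():
        if line.startswith("ATOM  "):
            atoms += 1
            # Columns 22-27: chain ID, residue number, insertion code.
            residues.add(line[21:27])
    return atoms, len(residues)


def keep_small_entries(pdb_texts, max_residues):
    """Keep only entries with at most max_residues residues.

    max_residues is an assumed, user-chosen cutoff, not a known RDKit limit.
    """
    return [
        text for text in pdb_texts
        if count_atoms_and_residues(text)[1] <= max_residues
    ]
```

`HETATM` records (waters, ligands) are ignored here, and a file with several models would have its atoms counted once per model, so the numbers are only a rough size estimate.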