Execute failed: An exception occured while running the Python kernel.

Hello. I've been having problems executing Python scripts. Sometimes they run, but other times I get this error:
Execute failed: An exception occured while running the Python kernel. See log for details.

And the log says as follows:
2023-01-23 09:02:13,174 : ERROR : KNIME-Worker-215-Python Script 3:1640:1641:1638 : : Node : Python Script : 3:1640:1641:1638 : Execute failed: An exception occured while running the Python kernel. See log for details.
org.knime.python2.kernel.PythonIOException: An exception occured while running the Python kernel. See log for details.
at org.knime.python3.scripting.Python3KernelBackend.putDataTable(Python3KernelBackend.java:447)
at org.knime.python2.kernel.PythonKernel.putDataTable(PythonKernel.java:304)
at org.knime.python2.ports.DataTableInputPort.execute(DataTableInputPort.java:116)
at org.knime.python3.scripting.nodes.AbstractPythonScriptingNodeModel.execute(AbstractPythonScriptingNodeModel.java:229)
at org.knime.core.node.NodeModel.executeModel(NodeModel.java:549)
at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1267)
at org.knime.core.node.Node.execute(Node.java:1041)
at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:595)
at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:98)
at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:201)
at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:117)
at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:367)
at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:221)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: py4j.Py4JException: An exception was raised by the Python Proxy. Return Message: Traceback (most recent call last):
File "C:\Users\lsand\anaconda3\envs\KNIME\lib\site-packages\py4j\clientserver.py", line 617, in _call_proxy
return_value = getattr(self.pool[obj_id], method)(*params)
File "D:\knime_4.6.4\plugins\org.knime.python3.scripting_4.7.0.v202211291148\src\main\python\_kernel_launcher.py", line 466, in setInputTable
self._backends.set_input_table(table_index, java_table_data_source)
File "D:\knime_4.6.4\plugins\org.knime.python3.scripting_4.7.0.v202211291148\src\main\python\_kernel_launcher.py", line 229, in set_input_table
table_data_source = kg.data_source_mapper(java_table_data_source)
File "D:\knime_4.6.4\plugins\org.knime.python3_4.7.0.v202211291350\src\main\python\knime_backend_gateway.py", line 267, in data_source_mapper
return DATA_SOURCES[identifier](java_table_data_source)
File "D:\knime_4.6.4\plugins\org.knime.python3.arrow_4.7.0.v202211291117\src\main\python\knime_arrow_backend.py", line 220, in __init__
self._reader = _OffsetBasedRecordBatchFileReader(
File "D:\knime_4.6.4\plugins\org.knime.python3.arrow_4.7.0.v202211291117\src\main\python\knime_arrow_backend.py", line 191, in __init__
self.schema = pa.ipc.read_schema(self._source_file)
File "pyarrow\ipc.pxi", line 1100, in pyarrow.lib.read_schema
File "pyarrow\error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow\error.pxi", line 115, in pyarrow.lib.check_status
OSError: Read out of bounds (offset = 8, size = 4) in file of size 6

at py4j.Protocol.getReturnValue(Protocol.java:476)
at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:108)
at jdk.proxy16/jdk.proxy16.$Proxy46.setInputTable(Unknown Source)
at org.knime.python3.scripting.Python3KernelBackend$PutDataTableTask.call(Python3KernelBackend.java:744)
at org.knime.python3.scripting.Python3KernelBackend$PutDataTableTask.call(Python3KernelBackend.java:1)
at org.knime.core.util.ThreadUtils$CallableWithContextImpl.callWithContext(ThreadUtils.java:383)
at org.knime.core.util.ThreadUtils$CallableWithContext.call(ThreadUtils.java:269)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)

Hi lsandinop,

thanks for sharing the KNIME log! You are using 4.7.0 and the latest Python Script node, correct?
Could you provide the script or a workflow so that we can investigate further?

Best regards
Steffen

Hello Steffen
Yes, I'm using KNIME 4.7.0 and the latest Python Script node, because the legacy ones don't work anymore.

This is the script. I could share the workflow but not the data.

```python
import knime.scripting.io as knio
import re
import pandas as pd
import spacy
from spacy.language import Language
from spacy.matcher import PhraseMatcher
from spacy.tokens import Span
from heapq import nsmallest
from spacy.tokens import Doc

input_table_1 = knio.input_tables[0].to_pandas()
input_table_2 = knio.input_tables[1].to_pandas()
input_table_3 = knio.input_tables[2].to_pandas()

##########################################
nlp = spacy.load("es_core_news_sm", disable=['ner', "tagger", "parser", "lemmatizer", 'attribute_ruler'])

config = {"punct_chars": [".", "\n", "\t", "*"]}
sentencizer = nlp.add_pipe("sentencizer", config=config, after='morphologizer')

############################################
nlpFrase = spacy.load("es_core_news_sm", disable=['ner', "tagger", "parser", "lemmatizer", 'attribute_ruler'])
targets = input_table_2['TARGET_P']
target_patterns = list(nlpFrase.pipe(targets))

matcher = PhraseMatcher(nlpFrase.vocab)
matcher.add("TARGET", None, *target_patterns)

# Define the custom component
@Language.component("target_component")
def target_component_function(doc):
    # Apply the matcher to the doc
    matches = matcher(doc)
    # Create a Span for each match and give it the label "TARGET"
    spans = []
    for i in range(len(matches)):
        valor = 0
        case = 0
        estaEntreOtro = "FALSE"
        for j in range(len(matches)):
            if i != j:
                if matches[i][1] >= matches[j][1] and matches[i][2] <= matches[j][2]:
                    estaEntreOtro = "TRUE"
                elif matches[i][1] <= matches[j][1] and matches[i][2] > matches[j][1]:
                    valor = matches[j][1]
                    case = 1
                elif matches[i][1] <= matches[j][2] and matches[i][2] > matches[j][2]:
                    valor = matches[j][2]
                    case = 2
        if estaEntreOtro == "FALSE":
            #if case == 1:
                #print("case 1:", matches[i][1], valor)
                #spans.append(Span(doc, matches[i][1], valor, label="TARGET"))
            if case == 2:
                spans.append(Span(doc, valor, matches[i][2], label="TARGET"))
            else:
                spans.append(Span(doc, matches[i][1], matches[i][2], label="TARGET"))

    # Overwrite doc.ents with the resulting spans
    doc.ents = spans
    return doc

# Add the component to the pipeline after the "tok2vec" component
nlpFrase.add_pipe("target_component", after='tok2vec')

#################################

ID = []
EVOLUCION = []
FRASE = []
TARGET = []
NEGADO = []

for i in range(len(input_table_1)):
    docTotal = nlp(input_table_1['COLANALISIS'][i])

    antFamiliares = ['FAMILIARES:', 'FAMILIARES :',
                     'ENFERMEDADES CON FACTOR HEREDITARIO:', 'ENFERMEDADES CON FACTOR HEREDITARIO :']

    for fraseA in list(docTotal.sents):
        frase = str(fraseA)
        fraseFinal = frase
        for antecedente in antFamiliares:
            indiceAntecedente = frase.find(antecedente.lower())
            if indiceAntecedente >= 0:
                frA = frase[:indiceAntecedente]
                frD = frase[(indiceAntecedente + len(antecedente)):]
                corte = frD.find(":")
                if corte < 0:
                    corte = len(frD)
                frD = frD[corte:]
                fraseFinal = frA + " " + frD

        doc = nlpFrase(fraseFinal)
        for ent in doc.ents:
            if ent.label_ == 'TARGET':
                ID.append(input_table_1['ID'][i])
                EVOLUCION.append(input_table_1['EVOLUCION'][i])
                FRASE.append(fraseFinal)
                TARGET.append(ent.text)
                negado = "FALSE"
                for neg in input_table_3['Negaciones']:
                    if neg in frase:
                        negado = "TRUE"
                    if ((("no " + ent.text) in frase) or ((ent.text + ": no") in frase)
                            or ((ent.text + " : no") in frase) or ((ent.text + ":no") in frase)):
                        negado = "TRUE"
                    if ent.text == "paro" and ((ent.text + " de fecales") in frase):
                        negado = "TRUE"
                NEGADO.append(negado)
                print("ID: ", input_table_1['ID'][i])

salida = {'ID': ID, 'FRASE': FRASE, 'TARGET': TARGET, 'NEGADO': NEGADO}
print("reached here 1")

output_table_1 = pd.DataFrame(salida)
print("reached here 2")
knio.output_tables[0] = knio.Table.from_pandas(output_table_1)
```
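As a side note, the negation check inside the loop above reduces to plain substring tests, so it can be isolated and unit-tested without spaCy or KNIME. A minimal sketch (the helper name `is_negated` is made up for illustration):

```python
def is_negated(target: str, sentence: str, negation_cues: list[str]) -> bool:
    """Replicate the substring-based negation heuristics of the script:
    a target counts as negated if any general negation cue occurs in the
    sentence, or if the target appears in a pattern like "no <target>"
    or "<target>: no"."""
    if any(cue in sentence for cue in negation_cues):
        return True
    patterns = ("no " + target, target + ": no", target + " : no", target + ":no")
    return any(p in sentence for p in patterns)
```

For example, `is_negated("fiebre", "paciente: no fiebre", [])` is true, while `is_negated("fiebre", "paciente con fiebre", [])` is false; pulling the heuristic out like this makes it easy to rule the script logic in or out as the source of an error.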

Hi @lsandinop ,

thanks! The formatting is a bit off (you could put ``` at the beginning and end of the whole code block). But could you share the workflow? Without the sensitive data, of course 🙂
We would like to take a closer look.

Besides that, what is the issue with the legacy nodes? Could you tell me what problem they have on your machine?

Best regards
Steffen

Hello Steffen

This is part of the workflow with just a small sample of the data, because it is sensitive information. I'm not sure the problem is related to the code, because it was working before. I had to reinstall my PC last week, and it has happened since then.

About the legacy nodes: I can't remember now what happened, because I had to change them all. I will try to replicate the error and let you know.

Thanks!
ErrorWorkflow.knar (50.5 KB)

Steffen, the legacy Python Script node worked. I think I didn't notice that there was a place in the Preferences to set the environments for legacy Python, so it didn't find the libraries. I think that was it, but I can't remember exactly what happened when I had to change the nodes.

Just to let you know that I'm working with the legacy nodes for now, because they work, but they seem to be slower than the new nodes.

Thanks

Hi @lsandinop,

I'm sorry, we currently cannot reproduce that.
I will open a ticket, but I cannot promise a solution soon. Could you please provide the following information?

  • The conda environment (`conda activate <your_environment>`, then `conda env export`)

We suspect that it has something to do with the hard drive or the file system.

Glad to hear that at least the legacy nodes work.

Best regards
Steffen

Hi Steffen
Here is my conda environment, in a .knar file because the original file extension is not allowed for upload. I appreciate it.
environment.knar (3.5 KB)

Thanks

Hi,

AP-20059 has been created.

Best regards
Steffen