Python Kernel Error on BERT Predictor

Hey there,

An error occurs while running the BERT Predictor, every time at about 72 %. Before that, the BERT Classification Learner runs without problems (about 36 hrs), even though the system is macOS on an M2 (normally TensorFlow 2 does not work on macOS, but we only copied an example workflow and it works). The KNIME version is 4.7.1.

The text in the log file says:
2023-03-29 09:29:31,211 : ERROR : KNIME-Worker-66-BERT Predictor 6:11 : : Node : BERT Predictor : 6:11 : Execute failed: An exception occured while running the Python kernel. See log for details.
org.knime.python2.kernel.PythonIOException: An exception occured while running the Python kernel. See log for details.
at org.knime.python3.scripting.Python3KernelBackend.getDataTable(
at org.knime.python2.kernel.PythonKernel.getDataTable(
at se.redfield.bert.core.BertCommands.getDataTable(
at se.redfield.bert.core.BertCommands.getDataTable(
at se.redfield.bert.nodes.predictor.BertPredictorNodeModel.runPredict(
at se.redfield.bert.nodes.predictor.BertPredictorNodeModel.execute(
at org.knime.core.node.NodeModel.executeModel(
at org.knime.core.node.Node.invokeFullyNodeModelExecute(
at org.knime.core.node.Node.execute(

at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(
at org.knime.core.node.workflow.NodeExecutionJob.internalRun(
at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(
at org.knime.core.util.ThreadUtils$
at java.base/java.util.concurrent.Executors$ Source)
at java.base/ Source)
at org.knime.core.util.ThreadPool$
at org.knime.core.util.ThreadPool$
Caused by: Row key checking: Error when processing batch
at org.knime.python3.arrow.SinkManager.checkRowKeys(
at org.knime.python3.arrow.SinkManager.convertToTable(
at org.knime.python3.scripting.Python3KernelBackend.lambda$3(
at org.knime.core.util.ThreadUtils$CallableWithContextImpl.callWithContext(
at org.knime.core.util.ThreadUtils$
at java.base/ Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$ Source)
at java.base/ Source)
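
The "Row key checking: Error when processing batch" cause suggests that the table coming back from Python failed a row key check on the KNIME side. The log does not say which check failed, so the following is only an assumption: one thing that may be worth ruling out is duplicate or empty row IDs in the table fed to the predictor. A minimal Python sketch of such a check on a list of row IDs; the helper name `find_bad_row_keys` and the sample keys are hypothetical and not part of KNIME or the BERT extension:

```python
from collections import Counter


def find_bad_row_keys(row_keys):
    """Return (duplicates, empties) for an iterable of row keys.

    Hypothetical helper for illustration; not a KNIME API.
    """
    keys = [str(k) for k in row_keys]
    counts = Counter(keys)
    # Keys that occur more than once, in first-seen order
    duplicates = [k for k, n in counts.items() if n > 1]
    # Positions of keys that are empty or whitespace only
    empties = [i for i, k in enumerate(keys) if not k.strip()]
    return duplicates, empties


# Illustrative keys only, not taken from the actual workflow
sample_keys = ["Row0", "Row1", "Row1", "", "Row4"]
duplicates, empties = find_bad_row_keys(sample_keys)
print("duplicate keys:", duplicates)
print("empty keys at positions:", empties)
```

If both lists come back empty for the real input, the row IDs are probably not the cause and the problem lies elsewhere.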

Here is also a screenshot of part of the workflow.

The sample data is from the "Misinformation & Fake News text dataset 79k" on Kaggle; only DataSet_Misinfo_TRUE.csv and EXTRA_RussianPropagandaSubset.csv are used.

Could anybody help to get the BERT Predictor running, please? I would be very thankful for any help.

Hello @educX

Could you please provide some more information regarding your setup?

  • Which model are you using in the BERT Learner node, and do you use fine-tuning or not?
  • What are the settings of the BERT Predictor node? Could you provide a screenshot?
  • Do you use a bundled Python environment or do you have your own? If it is your own environment, please share the output of the python -m pip freeze command.
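
If running pip from a terminal is inconvenient, a roughly similar package list can be collected from inside Python with the standard library. This is only a sketch: it assumes the script is run with the same Python environment that KNIME is configured to use, and the function name `describe_environment` is hypothetical. The output format imitates pip freeze but is not guaranteed to match it exactly.

```python
import platform
import sys
from importlib import metadata


def describe_environment():
    """Collect interpreter, platform, and installed package versions."""
    packages = sorted(
        f"{dist.metadata.get('Name')}=={dist.version}"
        for dist in metadata.distributions()
        if dist.metadata.get("Name")  # skip entries without a name
    )
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),  # e.g. arm64 on Apple Silicon
        "packages": packages,
    }


info = describe_environment()
print("Python:", info["python"])
print("Platform:", info["platform"], "/", info["machine"])
for line in info["packages"]:
    print(line)
```

The platform and machine lines help confirm whether the environment is running natively on the M2 or under emulation, which matters for the TensorFlow build in use.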
