Bug: cannot load PMML model into Predictor

Hi there, I built a Decision Tree model (roughly 500 features), and the PMML file is around 500MB. I am unable to run it in a predictor node (neither the Decision Tree Predictor nor the PMML Predictor), because there is a serialization issue when loading the model:

ERROR	 KNIME-Worker-10-PMML Predictor 3:151 Node	 Execute failed: Could not create PMML value.
java.lang.RuntimeException: Could not create PMML value.
	at org.knime.core.node.port.pmml.PMMLPortObject.getPMMLValue(PMMLPortObject.java:888)
	at org.knime.ensembles.pmmlpredict3.PMMLPredictorNodeModel3.execute(PMMLPredictorNodeModel3.java:134)
	at org.knime.core.node.NodeModel.executeModel(NodeModel.java:549)
	at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1267)
	at org.knime.core.node.Node.execute(Node.java:1041)
	at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:595)
	at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:95)
	at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:201)
	at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:117)
	at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:367)
	at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:221)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
	at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: java.lang.NegativeArraySizeException: -445500886
	at java.base/java.lang.String.<init>(Unknown Source)
	at java.base/java.lang.String.<init>(Unknown Source)
	at java.base/java.io.ByteArrayOutputStream.toString(Unknown Source)
	at org.knime.core.data.xml.XMLCellContent.serialize(XMLCellContent.java:257)
	at org.knime.core.data.xml.XMLCellContent.<init>(XMLCellContent.java:169)
	at org.knime.core.data.xml.XMLCellFactory.create(XMLCellFactory.java:162)
	at org.knime.core.data.xml.io.XMLDOMCellReader.readXML(XMLDOMCellReader.java:174)
	at org.knime.core.data.xml.XMLCellContent.parse(XMLCellContent.java:267)
	at org.knime.core.data.xml.XMLCellContent.<init>(XMLCellContent.java:137)
	at org.knime.core.data.xml.PMMLCellContent.<init>(PMMLCellContent.java:98)
	at org.knime.core.data.xml.PMMLCellContent.<init>(PMMLCellContent.java:82)
	at org.knime.core.data.xml.PMMLCellFactory.create(PMMLCellFactory.java:136)
	at org.knime.core.node.port.pmml.PMMLPortObject.getPMMLValue(PMMLPortObject.java:886)
	... 14 more
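
A note on the number in the exception, offered as a guess rather than a confirmed diagnosis: -445500886 is exactly what 32-bit int arithmetic produces when 1,924,733,205 is doubled. So the String constructor may have tried to size a two-bytes-per-character buffer for roughly 1.9 GB of serialized XML. The sketch below only reproduces that arithmetic. The byte count is inferred from the exception value, and the doubling step is my assumption about JDK internals, not something the trace states. `OverflowDemo` and `doubledLength` are made-up names.

```java
public class OverflowDemo {
    // Doubles a length using 32-bit int arithmetic, the way a decoder might
    // when sizing a 2-bytes-per-char buffer (assumption, not taken from the trace).
    static int doubledLength(int byteCount) {
        return byteCount << 1; // wraps around silently above 2^30 - 1
    }

    public static void main(String[] args) {
        // Hypothetical size of the serialized XML, inferred from the exception value.
        int byteCount = 1_924_733_205;

        // The int result has wrapped around to a negative number.
        System.out.println(doubledLength(byteCount)); // prints -445500886

        // The same doubling in 64-bit arithmetic shows the intended size.
        System.out.println(2L * byteCount); // prints 3849466410
    }
}
```

If this reading is right, the failure is an integer overflow on a very large string rather than the heap running out, which would explain why there is no OutOfMemoryError in the log.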

I’d like to file a bug report.

Thanks for the amazing software and keep up the great work!

EDIT: I reduced my model to 200MB and it barely fit into 20GB of RAM when deserializing.

Expected behavior:

  • being able to load decent-sized models
  • if this is caused by “not enough memory”, I’d like the error message to say so, so that I can act on it
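
To illustrate the second point, here is a minimal sketch of the kind of check I have in mind. It assumes the serialized bytes are sitting in a `ByteArrayOutputStream`, as the stack trace suggests. `SafeToString`, `checkSize`, `toUtf8String` and the `MAX_SAFE_BYTES` limit are hypothetical names and values for illustration, not KNIME code, and the limit is a deliberately conservative bound.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class SafeToString {
    // Conservative limit: a byte count that can still be doubled without
    // overflowing a 32-bit int (hypothetical threshold for this sketch).
    static final int MAX_SAFE_BYTES = Integer.MAX_VALUE / 2;

    // Fails early with a readable message instead of a NegativeArraySizeException.
    static void checkSize(long byteCount) {
        if (byteCount > MAX_SAFE_BYTES) {
            throw new IllegalStateException(
                "Serialized document is " + byteCount + " bytes, which exceeds the "
                + MAX_SAFE_BYTES + " byte limit for conversion to a single String");
        }
    }

    // Converts the buffered bytes to a String only after the size check passes.
    static String toUtf8String(ByteArrayOutputStream out) {
        checkSize(out.size());
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.writeBytes("<PMML/>".getBytes(StandardCharsets.UTF_8));
        System.out.println(toUtf8String(out)); // small input converts normally

        try {
            checkSize(1_924_733_205L); // size inferred from the exception above
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A message along these lines would tell me immediately that the model is too large for this code path, rather than leaving me to guess from a negative array size.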