Spark PMML Model Predictor

#1

I am trying to apply a KNIME-generated k-means model to a Spark DataFrame, but the PMML compiler throws an exception because the generated evaluate method exceeds the JVM's 65535-byte (64 KB) per-method limit. My model is built on 28 columns and compilation fails once k is larger than roughly 50. I use a meta-clustering approach, so in production I would be using a k of about 256. Is this fixable? The stack trace is below.

Best,

Aaron

Unable to compile expression
ERROR at line 89
The code of method evaluate(MainModel.Model_0_1.Data) is exceeding the 65535 bytes limit
  Line : 88  }
  Line : 89  public static Object[] evaluate(Data data) {
	at com.knime.pmml.compilation.java.compile.PMMLCompilerNodeModel.execute(PMMLCompilerNodeModel.java:98)
	at org.knime.core.node.NodeModel.executeModel(NodeModel.java:567)
	at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1186)
	at org.knime.core.node.Node.execute(Node.java:973)
	at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:559)
	at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:95)
	at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:179)
	at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:110)
	at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:328)
	at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:204)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
	at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: org.knime.ext.sun.nodes.script.compile.CompilationFailedException: Unable to compile expression
ERROR at line 89
The code of method evaluate(MainModel.Model_0_1.Data) is exceeding the 65535 bytes limit
  Line : 88  }
  Line : 89  public static Object[] evaluate(Data data) {
	at org.knime.ext.sun.nodes.script.compile.JavaCodeCompiler.compile(JavaCodeCompiler.java:330)
	at com.knime.pmml.compilation.java.compile.CompiledModelPortObject.compileModel(CompiledModelPortObject.java:174)
	at com.knime.pmml.compilation.java.compile.CompiledModelPortObject.setCode(CompiledModelPortObject.java:149)
	at com.knime.pmml.compilation.java.compile.CompiledModelPortObject.<init>(CompiledModelPortObject.java:94)
	at com.knime.pmml.compilation.java.compile.PMMLCompilerNodeModel.execute(PMMLCompilerNodeModel.java:95)
	... 13 more

#2

It is not the same, but have you thought about using H2O.ai k-means with Sparkling Water?
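
On the error itself: the JVM caps the bytecode of a single method at 65535 bytes, so any code generator that emits one block of code per cluster and per column will eventually hit that limit as k grows. I do not know how the KNIME PMML compiler lays out its generated code, so the following is only a minimal sketch of the general idea, not KNIME's implementation. The class name `NearestCentroid` and its layout are hypothetical. The point it illustrates is that when the centroids are held as data and scored in a loop, the size of the method does not depend on k or on the number of columns.

```java
import java.util.Random;

/**
 * Hypothetical sketch: k-means scoring with centroids stored as data.
 * This is NOT the code the KNIME PMML compiler generates.
 */
public class NearestCentroid {
    // Centroids live in an array, so the method body has the same size for any k.
    private final double[][] centroids;

    public NearestCentroid(double[][] centroids) {
        this.centroids = centroids;
    }

    /** Returns the index of the centroid closest to row (squared Euclidean distance). */
    public int evaluate(double[] row) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0.0;
            for (int f = 0; f < row.length; f++) {
                double diff = row[f] - centroids[c][f];
                dist += diff * diff;
            }
            // Strict comparison: on ties the lowest cluster index wins.
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Sizes taken from the question: 28 columns, k = 256. Centroid values are random placeholders.
        int k = 256;
        int features = 28;
        Random rnd = new Random(42);
        double[][] centroids = new double[k][features];
        for (double[] centroid : centroids) {
            for (int f = 0; f < features; f++) {
                centroid[f] = rnd.nextDouble();
            }
        }
        NearestCentroid model = new NearestCentroid(centroids);
        // Scoring a centroid against the model should return that centroid's own index.
        System.out.println(model.evaluate(centroids[17]));
    }
}
```

In Spark, a scorer like this could be wrapped in a UDF and fed the cluster centers read from the PMML file, but whether that fits your workflow depends on how the rest of the pipeline is built.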
