Hi,
I created an H2O Sparkling Water Context over a Spark Livy Context on GCE Dataproc using only KNIME nodes, and the H2O Random Forest Learner node fails with this error:
“ERROR : KNIME-Worker-29-H2O Random Forest Learner 3:2688 : : Node : H2O Random Forest Learner : 3:2688 : Execute failed: Job crashed unexpected. Cause: org.knime.bigdata.spark.core.exception.KNIMESparkException: error cannot be computed: too many classes (UnsupportedOperationException) See log for details.
org.knime.ext.h2o.exception.H2OJobCrashedException: Job crashed unexpected. Cause: org.knime.bigdata.spark.core.exception.KNIMESparkException: error cannot be computed: too many classes (UnsupportedOperationException) See log for details.
at org.knime.ext.h2o.jobs.DefaultH2OJobFuture.get(DefaultH2OJobFuture.java:98)
at org.knime.ext.h2o.jobs.AbstractH2OExecutionContext.submit(AbstractH2OExecutionContext.java:76)
at org.knime.ext.h2o.context.DefaultH2OSession.futureOf(DefaultH2OSession.java:213)
at org.knime.ext.h2o.context.DefaultH2OSession.run(DefaultH2OSession.java:369)
at org.knime.ext.h2o.nodes.learner.drf.H2ODRFNodeModel3.run(H2ODRFNodeModel3.java:86)
at org.knime.ext.h2o.nodes.learner.drf.H2ODRFNodeModel3.run(H2ODRFNodeModel3.java:1)
at org.knime.ext.h2o.nodes.AbstractH2OSupervisedNodeModel.execute(AbstractH2OSupervisedNodeModel.java:157)
at org.knime.core.node.NodeModel.executeModel(NodeModel.java:576)
at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1236)
at org.knime.core.node.Node.execute(Node.java:1016)
at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:558)
at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:95)
at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:201)
at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:117)
at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:334)
at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:210)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: java.util.concurrent.ExecutionException: org.knime.bigdata.spark.core.exception.KNIMESparkException: error cannot be computed: too many classes (UnsupportedOperationException)
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.knime.ext.h2o.spark.H2OSparkJobFactory$H2OSimpleSparkJob$1.checkException(H2OSparkJobFactory.java:221)
at org.knime.ext.h2o.spark.H2OSparkJobFactory$H2OSimpleSparkJob$1.&lt;init&gt;(H2OSparkJobFactory.java:175)
at org.knime.ext.h2o.spark.H2OSparkJobFactory$H2OSimpleSparkJob.getStatus(H2OSparkJobFactory.java:173)
at org.knime.ext.h2o.jobs.DefaultH2OJobFuture.get(DefaultH2OJobFuture.java:87)
… 19 more
Caused by: org.knime.bigdata.spark.core.exception.KNIMESparkException: error cannot be computed: too many classes (UnsupportedOperationException)
at org.knime.bigdata.spark2_4.base.LivySparkJob.call(LivySparkJob.java:106)
at org.knime.bigdata.spark2_4.base.LivySparkJob.call(LivySparkJob.java:1)
at org.apache.livy.rsc.driver.BypassJob.call(BypassJob.java:40)
at org.apache.livy.rsc.driver.BypassJob.call(BypassJob.java:27)
at org.apache.livy.rsc.driver.JobWrapper.call(JobWrapper.java:64)
at org.apache.livy.rsc.driver.BypassJobWrapper.call(BypassJobWrapper.java:45)
at org.apache.livy.rsc.driver.BypassJobWrapper.call(BypassJobWrapper.java:27)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)”
My input data has 8360 columns, and the target column has 3841 categories. Do you have any suggestions?
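In case it helps pinpoint the problem, here is a minimal sketch (hypothetical sample data, plain Python, no Spark) of how I counted the distinct classes in the target column to confirm that its cardinality, rather than the column count, is what trips the "too many classes" check:

```python
# Count distinct values in the target column of some row-shaped data.
# The column names and rows below are made-up placeholders, not my real data.
from collections import Counter

rows = [
    {"feature_a": 1.0, "target": "cat_17"},
    {"feature_a": 2.0, "target": "cat_42"},
    {"feature_a": 3.0, "target": "cat_17"},
]

class_counts = Counter(r["target"] for r in rows)
n_classes = len(class_counts)
print(f"{n_classes} distinct classes in the target column")

# In my real dataset this reports 3841 classes; a multinomial target with
# thousands of levels often means the column is ID-like, so grouping rare
# classes (or treating the problem differently) may be necessary.
```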
Thank you,
Mihai