Hi,
I am trying to run the Spark Predictor (Classification) node with a model created on GCP Dataproc, and I get this error:
2019-12-31 10:41:54,991 : ERROR : KNIME-Worker-29-Spark Predictor (Classification) 2:2678 : : Node : Spark Predictor (Classification) : 2:2678 : Execute failed: empty collection (UnsupportedOperationException)
java.lang.UnsupportedOperationException: empty collection
at org.apache.spark.rdd.RDD$$anonfun$first$1.apply(RDD.scala:1380)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.first(RDD.scala:1377)
at org.apache.spark.ml.util.DefaultParamsReader$.loadMetadata(ReadWrite.scala:615)
at org.apache.spark.ml.util.DefaultParamsReader$.loadParamsInstance(ReadWrite.scala:650)
at org.apache.spark.ml.Pipeline$SharedReadWrite$$anonfun$4.apply(Pipeline.scala:274)
at org.apache.spark.ml.Pipeline$SharedReadWrite$$anonfun$4.apply(Pipeline.scala:272)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
at org.apache.spark.ml.Pipeline$SharedReadWrite$.load(Pipeline.scala:272)
at org.apache.spark.ml.PipelineModel$PipelineModelReader.load(Pipeline.scala:348)
at org.apache.spark.ml.PipelineModel$PipelineModelReader.load(Pipeline.scala:342)
at org.apache.spark.ml.util.MLReadable$class.load(ReadWrite.scala:380)
at org.apache.spark.ml.PipelineModel$.load(Pipeline.scala:332)
at org.apache.spark.ml.PipelineModel.load(Pipeline.scala)
at org.knime.bigdata.spark2_4.jobs.namedmodels.NamedModelUploaderJob.runJob(NamedModelUploaderJob.java:55)
at org.knime.bigdata.spark2_4.jobs.namedmodels.NamedModelUploaderJob.runJob(NamedModelUploaderJob.java:1)
at org.knime.bigdata.spark.local.wrapper.LocalSparkWrapperImpl.runJob(LocalSparkWrapperImpl.java:123)
at org.knime.bigdata.spark.local.context.LocalSparkJobController.lambda$1(LocalSparkJobController.java:92)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I used the Spark Random Forest Learner node on GCP to create the model, saved it locally, and then used a Model Reader node to read it and plugged that node into the Spark Predictor (Classification) node. I tried running the Predictor node on a local context and on a GCP context, with the same result.
Thank you,
Mihai