Parquet to Spark - Execute failed: Unsupported class file major version 55 (IllegalArgumentException) java.lang.IllegalArgumentException: Unsupported class file major version 55

I am trying to read data from Parquet files stored in HDFS using a Spark context created via Livy, but the "Parquet to Spark" node fails on execution with the following error:
Execute failed: Unsupported class file major version 55 (IllegalArgumentException)
java.lang.IllegalArgumentException: Unsupported class file major version 55

The detailed log is as follows:

2020-10-07 21:32:09,271 : ERROR : KNIME-Worker-14-Parquet to Spark 0:67 : : Node : Parquet to Spark : 0:67 : Execute failed: Unsupported class file major version 55 (IllegalArgumentException)
java.lang.IllegalArgumentException: Unsupported class file major version 55
at org.apache.xbean.asm6.ClassReader.&lt;init&gt;(ClassReader.java:166)
at org.apache.xbean.asm6.ClassReader.&lt;init&gt;(ClassReader.java:148)
at org.apache.xbean.asm6.ClassReader.&lt;init&gt;(ClassReader.java:136)
at org.apache.xbean.asm6.ClassReader.&lt;init&gt;(ClassReader.java:237)
at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:50)
at org.apache.spark.util.FieldAccessFinder$$anon$4$$anonfun$visitMethodInsn$7.apply(ClosureCleaner.scala:845)
at org.apache.spark.util.FieldAccessFinder$$anon$4$$anonfun$visitMethodInsn$7.apply(ClosureCleaner.scala:828)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:134)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:134)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at org.apache.spark.util.FieldAccessFinder$$anon$4.visitMethodInsn(ClosureCleaner.scala:828)
at org.apache.xbean.asm6.ClassReader.readCode(ClassReader.java:2175)
at org.apache.xbean.asm6.ClassReader.readMethod(ClassReader.java:1238)
at org.apache.xbean.asm6.ClassReader.accept(ClassReader.java:631)
at org.apache.xbean.asm6.ClassReader.accept(ClassReader.java:355)
at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:272)
at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:271)
at scala.collection.immutable.List.foreach(List.scala:392)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:271)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:163)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2326)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2100)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:990)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
at org.apache.spark.rdd.RDD.collect(RDD.scala:989)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.mergeSchemasInParallel(ParquetFileFormat.scala:633)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat.inferSchema(ParquetFileFormat.scala:241)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$6.apply(DataSource.scala:180)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$6.apply(DataSource.scala:180)
at scala.Option.orElse(Option.scala:289)
at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:179)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:373)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at org.knime.bigdata.spark2_4.jobs.genericdatasource.GenericDataSource2SparkJob.runJob(GenericDataSource2SparkJob.java:76)
at org.knime.bigdata.spark2_4.jobs.genericdatasource.GenericDataSource2SparkJob.runJob(GenericDataSource2SparkJob.java:1)
at org.knime.bigdata.spark2_4.base.LivySparkJob.call(LivySparkJob.java:94)
at org.knime.bigdata.spark2_4.base.LivySparkJob.call(LivySparkJob.java:1)
at org.apache.livy.rsc.driver.BypassJob.call(BypassJob.java:40)
at org.apache.livy.rsc.driver.BypassJob.call(BypassJob.java:27)
at org.apache.livy.rsc.driver.JobWrapper.call(JobWrapper.java:64)
at org.apache.livy.rsc.driver.BypassJobWrapper.call(BypassJobWrapper.java:45)
at org.apache.livy.rsc.driver.BypassJobWrapper.call(BypassJobWrapper.java:27)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)

Hello,

This may be a Java environment variable or version issue.

Please provide the following information for troubleshooting:

  • What version(s) of KNIME Analytics Platform (AP), KNIME Server (KS), and KNIME Executor (KE) are being used?
  • What operating system is your KNIME software running on?
  • What version of Java is being used (e.g., the output of java -version)?

Regards,
Nickolaus


Maybe not a solution for this bug, but you could try bringing the Parquet file into Spark via a Hive or Impala external table.

@saurabhgoals101 The error message indicates that this is indeed a Java version problem, as @NDekay stated. However, the stack trace clearly shows that it has nothing to do with the Java version that KNIME AP or KNIME Server uses to connect. The error happens inside your cluster, not in KNIME AP; it is merely logged in KNIME AP because we transfer cluster-side errors into the AP and log them there, so the problem is easier to diagnose.

It is a cluster-side issue: Spark is running on a Java 8 JVM but is trying to load a class that was compiled for Java 11 (hence "Unsupported class file major version 55"). I am not sure how this can happen, but it is not a problem with the KNIME software; it needs to be fixed in your cluster setup. What type of cluster are you using (Cloudera CDH, HDP, CDP, Amazon EMR, …)? If it is from one of the big vendors, it might be worth contacting their support about it, or asking in their support forums.


Thanks for the response. The problem is resolved. It was caused by a Spark/Livy version mismatch: the cluster runs Spark 2.4, but the installed Livy image only supports Spark 2.3. Upgrading Livy to a build that supports Spark 2.4 fixed the issue.
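For anyone hitting the same thing: the decisive check is whether the Spark major.minor version your Livy build targets matches the cluster's Spark. As an illustrative sketch only (the function name is mine, not part of any KNIME or Livy API):

```python
def spark_livy_compatible(cluster_spark, livy_supported):
    """Compare major.minor Spark versions; a mismatch here reproduces the failure above."""
    major_minor = lambda v: tuple(int(p) for p in v.split(".")[:2])
    return major_minor(cluster_spark) == major_minor(livy_supported)

print(spark_livy_compatible("2.4.0", "2.3.2"))  # the mismatch from this thread -> False
print(spark_livy_compatible("2.4.0", "2.4.4"))  # after upgrading Livy -> True
```

Patch-level differences are usually fine; it is the major.minor pair that must line up.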
