Parquet Brotli Compression Format Missing Class Implementation

Problem

Select BROTLI as the file compression in the Parquet Writer node and execute the node. An error is generated stating that the BrotliCodec class was not found.

Note: The class implementing LZ4 is also missing. However, as LZ4 is deprecated, it should be removed from the list of options if it is no longer supported. All other file compression algorithms work as intended.

Version

KNIME Extension for Big Data File Formats 4.6.0.v202205251302

Logfile

2022-10-29 17:09:07,005 : ERROR : KNIME-Worker-34-Parquet Writer 5:38 :  : Node : Parquet Writer : 5:38 : Execute failed: Class org.apache.hadoop.io.compress.BrotliCodec was not found
org.apache.parquet.hadoop.BadConfigurationException: Class org.apache.hadoop.io.compress.BrotliCodec was not found
	at org.apache.parquet.hadoop.CodecFactory.getCodec(CodecFactory.java:243)
	at org.apache.parquet.hadoop.CodecFactory$HeapBytesCompressor.<init>(CodecFactory.java:144)
	at org.apache.parquet.hadoop.CodecFactory.createCompressor(CodecFactory.java:208)
	at org.apache.parquet.hadoop.CodecFactory.getCompressor(CodecFactory.java:191)
	at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:296)
	at org.apache.parquet.hadoop.ParquetWriter$Builder.build(ParquetWriter.java:658)
	at org.knime.bigdata.fileformats.parquet.ParquetFileFormatWriter.<init>(ParquetFileFormatWriter.java:133)
	at org.knime.bigdata.fileformats.parquet.ParquetFormatFactory.getWriter(ParquetFormatFactory.java:199)
	at org.knime.bigdata.fileformats.node.writer2.FileFormatWriter2NodeModel.createWriter(FileFormatWriter2NodeModel.java:289)
	at org.knime.bigdata.fileformats.node.writer2.FileFormatWriter2NodeModel.writeToFile(FileFormatWriter2NodeModel.java:257)
	at org.knime.bigdata.fileformats.node.writer2.FileFormatWriter2NodeModel.write(FileFormatWriter2NodeModel.java:214)
	at org.knime.bigdata.fileformats.node.writer2.FileFormatWriter2NodeModel.execute(FileFormatWriter2NodeModel.java:185)
	at org.knime.bigdata.fileformats.node.writer2.FileFormatWriter2NodeModel.execute(FileFormatWriter2NodeModel.java:1)
	at org.knime.core.node.NodeModel.executeModel(NodeModel.java:549)
	at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1267)
	at org.knime.core.node.Node.execute(Node.java:1041)
	at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:595)
	at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:95)
	at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:201)
	at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:117)
	at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:367)
	at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:221)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
	at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.compress.BrotliCodec
	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source)
	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown Source)
	at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
	at org.eclipse.osgi.internal.framework.ContextFinder.loadClass(ContextFinder.java:147)
	at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
	at org.apache.parquet.hadoop.CodecFactory.getCodec(CodecFactory.java:237)
	... 25 more
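The root cause in the trace above is an ordinary `ClassNotFoundException`: no jar on the classpath provides `org.apache.hadoop.io.compress.BrotliCodec`, so Parquet's `CodecFactory` lookup fails and is wrapped in a `BadConfigurationException`. A minimal sketch of that lookup, assuming (as in the KNIME extension) that no Brotli codec jar is present:

```java
// Minimal reproduction of the failure mode behind the error above.
// Assumes no jar providing org.apache.hadoop.io.compress.BrotliCodec
// is on the classpath.
public class BrotliCodecCheck {

    // Returns true if the named class can be loaded, false otherwise.
    // Parquet's CodecFactory performs essentially this lookup and wraps
    // the ClassNotFoundException in a BadConfigurationException.
    static boolean isCodecAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String codec = "org.apache.hadoop.io.compress.BrotliCodec";
        System.out.println(codec + " available: " + isCodecAvailable(codec));
    }
}
```

Running this without a Brotli codec jar prints `available: false`, which is the same missing-class condition the Parquet Writer node hits at execution time.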

Hi @DiaAzul,

Thanks for reporting this! The Parquet writer supports only uncompressed, Snappy, GZip and ZSTD. The node mistakenly lists other compression algorithms too. We are going to remove them from the list in a future version.

Cheers,
Sascha

