AVRO file reader?

#1

Hi,

Is there an AVRO file reader node, or are there plans to add one?
I see the big data extension includes Parquet and ORC reader/writers, but not AVRO.

Cheers,

Richard

Edit: AVRO is even mentioned in the “Blend data from any source” section of the Analytics Platform page. Am I missing something?

1 Like

#2

I suspect that link refers to the "Avro to Spark" node, which might not really be what you are looking for.

0 Likes

#3

Yeah, I’m after a local file reader/writer node. As I said, there are Parquet and ORC ones.

0 Likes

#4

Hi Richard,

I made you an example showing how you can read and write AVRO files with KNIME directly on your local disk.

Cheers, Iris

4 Likes

#5

Hi Iris,

Thanks for the suggestion. Unfortunately the Avro to Spark node is throwing:

java.lang.NullPointerException
at org.knime.bigdata.spark2_4.api.TypeConverters.getConverter(TypeConverters.java:121)
at org.knime.bigdata.spark2_4.api.TypeConverters.convertSpec(TypeConverters.java:162)
at org.knime.bigdata.spark2_4.jobs.genericdatasource.GenericDataSource2SparkJob.runJob(GenericDataSource2SparkJob.java:82)
at org.knime.bigdata.spark2_4.jobs.genericdatasource.GenericDataSource2SparkJob.runJob(GenericDataSource2SparkJob.java:1)
at org.knime.bigdata.spark.local.wrapper.LocalSparkWrapperImpl.runJob(LocalSparkWrapperImpl.java:121)
at org.knime.bigdata.spark.local.context.LocalSparkJobController.lambda$1(LocalSparkJobController.java:92)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

The AVRO file contains quite complex objects, and the schema does permit null values in attributes.
I don’t think I can post the AVRO here, since it contains data that we’ve licensed, but I can send it directly somewhere.
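For context, nullable attributes in Avro are expressed as unions with "null", and our records nest such unions inside complex types. A minimal made-up schema (not our actual data) showing the general shape:

```json
{
  "type": "record",
  "name": "Outer",
  "fields": [
    {
      "name": "inner",
      "type": ["null", {
        "type": "record",
        "name": "Inner",
        "fields": [
          {"name": "value", "type": ["null", "string"], "default": null}
        ]
      }],
      "default": null
    }
  ]
}
```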

Cheers,

Richard

0 Likes