AVRO file reader?

#1

Hi,

Is there an AVRO file reader node, or are there plans to add one?
I see the big data extension includes Parquet and ORC reader/writers, but not AVRO.

Cheers,

Richard

Edit: AVRO is even mentioned in the “Blend data from any source” section of the Analytics Platform page. Am I missing something?

1 Like

#2

I suspect that link refers to the "Avro to Spark" node, which might not really be what you are looking for.

0 Likes

#3

Yeah, I’m after a local file reader/writer node. As I said, there are Parquet and ORC ones.

0 Likes

#4

Hi Richard,

I made you an example showing how you can read and write AVRO files with KNIME directly on your local disk.

Cheers, Iris

4 Likes

#5

Hi Iris,

Thanks for the suggestion. Unfortunately the Avro to Spark node is throwing:

java.lang.NullPointerException
at org.knime.bigdata.spark2_4.api.TypeConverters.getConverter(TypeConverters.java:121)
at org.knime.bigdata.spark2_4.api.TypeConverters.convertSpec(TypeConverters.java:162)
at org.knime.bigdata.spark2_4.jobs.genericdatasource.GenericDataSource2SparkJob.runJob(GenericDataSource2SparkJob.java:82)
at org.knime.bigdata.spark2_4.jobs.genericdatasource.GenericDataSource2SparkJob.runJob(GenericDataSource2SparkJob.java:1)
at org.knime.bigdata.spark.local.wrapper.LocalSparkWrapperImpl.runJob(LocalSparkWrapperImpl.java:121)
at org.knime.bigdata.spark.local.context.LocalSparkJobController.lambda$1(LocalSparkJobController.java:92)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

The AVRO file contains quite complex objects, and the schema does permit null values in attributes.
I don’t think I can post the AVRO here, since it contains data that we’ve licensed, but I can send it directly somewhere.
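For context, nullable attributes in Avro are expressed as unions with "null", and our records nest such unions inside complex types. A minimal made-up schema (not our actual data) showing the general shape:

```json
{
  "type": "record",
  "name": "Outer",
  "fields": [
    {
      "name": "inner",
      "type": ["null", {
        "type": "record",
        "name": "Inner",
        "fields": [
          {"name": "value", "type": ["null", "string"], "default": null}
        ]
      }],
      "default": null
    }
  ]
}
```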

Cheers,

Richard

0 Likes