Parquet format reader/writer separate from Spark

Is there a node that supports reading or writing Parquet files without connecting to Spark? I have created a basic stand-alone Parquet Reader and Parquet Writer node, but they only handle basic KNIME DataCell types (numeric and string) and can run out of memory when working with large Parquet files. Please let me know if there are other stand-alone options I can use to read and write Parquet files. I am similarly interested in readers/writers for other big data file formats, like Avro. Thanks!

Hi,

There are no nodes for this in the current release, but there is a set in development right now. This will include reader/writer nodes for Parquet, ORC, and Avro.

Cheers,
Roland


Hello @RolandBurger,
Any details about the availability of these nodes?
Regards,
Sébastien

Hi there!

I’m no Roland but maybe this link will help:
https://www.knime.com/whats-new-in-knime-36#parquet-orc

Br,
Ivan


Indeed, KNIME 3.6 has the nodes I was seeking!
