.parquet vs .table formats for file storage

Previously I’ve always stored data from KNIME in the proprietary .table format.

With recent versions, support for .parquet files is far better, and from some quick experiments they seem to load and save faster than the .table format and take up far less disk space. Parquet files can also be read by other programs. So is there any reason I shouldn’t just use .parquet files instead of .table from now on?

The only drawback I’ve found is that I can’t drag-and-drop a .parquet file from the KNIME Explorer window, but this is hardly a deal breaker for me.

Thanks

Hi,
If you have only “normal” data types, using the Parquet Reader and Writer should not be a problem at all. Parquet does not support everything you can put into a KNIME table, though. I just tested it with the workflow output of a Capture Workflow End node, putting that into a Model to Cell node and passing the result to the Parquet Writer, and it complained about an unsupported data type. This is probably the case with other types as well, but if you are dealing with simple things like Strings and Doubles, you should be fine.
Kind regards,
Alexander

@dpowyslybbe
Interesting. Can you elaborate on what kind of performance gain we are talking about?
Thanks

Thanks, that does make a lot of sense. Whilst I use other data types within workflows, the data going in and out is typically in more standard types like strings, doubles, date/time, etc.

I’ve done no scientific experiments, but in a couple of tests I did, .parquet seemed to read/write about twice as fast as .table. For context, the file had ~20 million rows and 30 columns.
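
For anyone who wants firmer numbers than a couple of ad hoc runs, here is a minimal sketch of a timing harness in Python. It is only an illustration: `time_roundtrip` and the helper functions are made-up names, and JSON and pickle are stand-ins so that it runs with the standard library alone. To measure Parquet outside KNIME you could pass wrappers around pandas' `DataFrame.to_parquet` and `pandas.read_parquet` instead (assuming pandas and a Parquet engine are installed). The .table format is KNIME-specific, so that side of the comparison would have to be timed inside KNIME itself.

```python
import json
import pickle
import statistics
import tempfile
import time
from pathlib import Path


def time_roundtrip(write, read, path, repeats=5):
    """Return median write/read times (seconds) and file size for one format.

    `write` and `read` are callables that take a file path (hypothetical
    interface chosen for this sketch).
    """
    write_times, read_times = [], []
    for _ in range(repeats):
        start = time.perf_counter()
        write(path)
        write_times.append(time.perf_counter() - start)

        start = time.perf_counter()
        read(path)
        read_times.append(time.perf_counter() - start)
    return {
        "write_s": statistics.median(write_times),
        "read_s": statistics.median(read_times),
        "size_bytes": Path(path).stat().st_size,
    }


# Stand-in data: 10,000 small rows, far smaller than the 20-million-row file above.
rows = [{"id": i, "value": i * 0.5, "label": f"row{i}"} for i in range(10_000)]


def write_json(path):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(rows, f)


def read_json(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def write_pickle(path):
    with open(path, "wb") as f:
        pickle.dump(rows, f)


def read_pickle(path):
    with open(path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    results = {
        "json": time_roundtrip(write_json, read_json, Path(tmp) / "data.json"),
        "pickle": time_roundtrip(write_pickle, read_pickle, Path(tmp) / "data.pkl"),
    }

for name, r in results.items():
    print(f"{name}: write {r['write_s']:.4f}s, read {r['read_s']:.4f}s, "
          f"{r['size_bytes']} bytes")
```

Taking the median over several repeats smooths out one-off hiccups, but repeated reads of the same file will mostly be served from the operating system's cache, so the read times are likely optimistic compared with a cold read.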

@dpowyslybbe

At the moment I would stick to KNIME’s .table format for intermediate storage, since it is more ‘native’ to KNIME and can store more data types.

If you plan to share data, Parquet or ORC might indeed be an option that would keep your formats ‘alive’.

If you are interested in more tweaks, you could check out this collection and also try the Parquet internal storage (which I think is still officially in beta).
