Azure Blob vs Parquet - BigDataFileFormatException

#1

Hi,

I'm reaching out because I've run out of ideas for troubleshooting what should be a simple workflow.

I have connected KNIME (desktop) to Azure Blob Storage using the Azure Blob Store Connection node.
I manually uploaded two files to this blob storage: an .xlsx file and a Parquet file.

I then connected an Azure Blob Store File Picker node to pick a file and assign it to a variable (AzurePickedFile).

By connecting an Excel Reader node, I had no problem retrieving the target Excel file.

Using the same logic, I tried to retrieve a parquet file (Blob Connector >> Blob Store File Picker >> Parquet Reader). When doing so, I get an error message on the Parquet Reader node (“org.knime.bigdata.fileformats.utility.BigDataFileFormatException”).

To check whether the problem came from the Parquet file itself, I downloaded it and read the local copy with the Parquet Reader node. This worked without any issue.

Since neither the connection to the blob storage nor reading the Parquet file seems to be at fault, I am unable to isolate the cause.

Has anyone experienced something similar?

Thanks for your time, mates.


#2

Hi @juliendk
welcome to the KNIME community!

Unfortunately, the Parquet Reader node is currently not able to read directly from Azure Blob Storage. The workaround is to use the Download node to download the file, and then use the Parquet Reader in local mode.
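For anyone who wants to reproduce the same download-then-read-locally pattern outside KNIME, here is a minimal Python sketch. It is only an illustration and not part of the KNIME workflow: `read_via_local_copy` is a hypothetical helper, and the remote stream and the reader function are assumptions that you would supply yourself.

```python
import shutil
import tempfile
from pathlib import Path
from typing import BinaryIO, Callable, TypeVar

T = TypeVar("T")


def read_via_local_copy(
    remote_stream: BinaryIO,
    read_local: Callable[[Path], T],
    suffix: str = ".parquet",
) -> T:
    """Copy a remote byte stream to a temporary local file, then read that file.

    Hypothetical helper mirroring the "Download node, then local reader" workaround.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = Path(tmp_dir) / f"download{suffix}"
        # Step 1: download, i.e. copy the remote bytes into a local file.
        with open(local_path, "wb") as out_file:
            shutil.copyfileobj(remote_stream, out_file)
        # Step 2: read the local copy. The reader must load the data fully
        # before returning, because the temporary directory is deleted
        # as soon as this block exits.
        return read_local(local_path)
```

In practice, `remote_stream` would be whatever file-like object your storage client returns for the blob, and `read_local` could be a local Parquet reader such as `pyarrow.parquet.read_table` (assuming pyarrow is installed).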

Best, Mareike


#3

Hi @mareike.hoeger,

Thank you very much for this clear answer, problem solved.

Have a great day,

Julien.
