Read Excel 97-2003 files that have a format problem with Python

Hello,

I want to read the content of the following example file with pandas in Python. It is an Excel 97-2003 file.
20230714141942.xls (1.6 MB)
After calling "pandas.read_excel(file)" I am first asked for an engine. After specifying 'xlrd', the following error is raised:
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'Time\t P1'

Normally pandas should open an xls file without the need to specify an engine, so there is definitely a problem with the format.

Thank you for your help!

@Marks123 welcome to the KNIME forum. Have you tried openpyxl? I am not at my computer, so I cannot test right now.

Another option could be to try R, but if there are indeed problems in the file this might not work:

Hello @mlauber71

Thank you for your answer!
Yes, I have also tried these options. With openpyxl as the engine, it expects a zip file.

@Marks123 you could try to use this library to import older Excel files. I thought the new Excel node should support older files. I will try something later.

This file is just a text file so you can just use the File Reader to read it …
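One way to confirm this before choosing a reader is to look at the first bytes of the file. The sketch below is only an illustration: the function name `sniff_spreadsheet_format` is made up for this example, and it assumes that a genuine Excel 97-2003 workbook starts with the OLE2 compound file signature and that an .xlsx file starts with the zip signature. Anything else is treated as probably plain text, which matches the b'Time\t P1' bytes shown in the xlrd error.

```python
# Signatures assumed for this sketch:
# OLE2 compound file (typical for real .xls workbooks) and zip (used by .xlsx).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
ZIP_MAGIC = b"PK\x03\x04"


def sniff_spreadsheet_format(path):
    """Guess the real format of a file from its first bytes.

    Returns "xls", "xlsx", or "text". The name and the return values
    are hypothetical and only serve this example.
    """
    with open(path, "rb") as handle:
        head = handle.read(8)
    if head.startswith(OLE2_MAGIC):
        return "xls"   # binary Excel 97-2003 workbook, try engine="xlrd"
    if head.startswith(ZIP_MAGIC):
        return "xlsx"  # zip-based workbook, try engine="openpyxl"
    return "text"      # probably delimited text with a misleading extension


# Usage (the file name is taken from the first post):
# print(sniff_spreadsheet_format("20230714141942.xls"))
```

If this reports "text", neither xlrd nor openpyxl is likely to help, and a text reader such as the File Reader node or a CSV parser is the better choice.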

The Excel import function in gdata has been deprecated. I will have to update the workflow.

I was interested, so I tried it and can confirm @mlauber71's hunch.
When I change the file type and read it with Python, it works fine.

So if for whatever reason you want to use Python, you are good to go.
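A minimal sketch of what reading the file as text could look like with pandas. It assumes the file is tab-separated, which is what the b'Time\t P1' bytes in the error message suggest; the helper name `read_mislabeled_xls` and its defaults are made up for this example. Renaming the file is not strictly needed, since pandas.read_csv does not look at the extension.

```python
import pandas as pd


def read_mislabeled_xls(source, sep="\t", encoding=None):
    """Read a delimited text file that carries an .xls extension.

    Assumptions of this sketch: tab-separated columns and a single
    header row. Pass an explicit encoding if the default (UTF-8) fails.
    """
    df = pd.read_csv(
        source,
        sep=sep,
        skipinitialspace=True,  # drop the space that follows each tab
        encoding=encoding,
    )
    # Strip any whitespace left around the column names.
    df.columns = df.columns.str.strip()
    return df


# Usage (the file name is taken from the first post):
# df = read_mislabeled_xls("20230714141942.xls")
# print(df.head())
```

Column types may still need adjusting afterwards, depending on how the values were written.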

br

@Marks123 here are a few examples of how to deal with Excel 97-2003 files, as well as an import of your original file. Some adaptations concerning types are also included:

Excel 97-2003 Import - KNIME Forum (77371).knwf (578.8 KB)

@mlauber71 and @Daniel_Weikert, thank you very much for your answers. This little detail about the CSV format helps a lot!

I still do not understand why these files, which were created with LabVIEW, are shown as .xls files (even when checking the properties), but I am now able to proceed. Thanks!