File Reader header recognition

lucatoldo · August 26, 2009, 2:55pm

Dear All,
I have CSV files which have the first 3 rows starting with # comment,
then there is an empty line
followed by the CSV value lines, whereby the first one is the header:
#comment1
#comment2
#comment3

header1,header2,header3
10,20,30
20,30,40
Currently the File reader is not able to identify the header line, due to the “empty line”, and generates the following:
RowID  Col0     Col1     Col2
Row0   header1  header2  header3
Row1   10       20       30
Row2   20       30       40
Unfortunately there is no way to skip a certain amount of lines …
The only solution I found to this problem was to … edit each file and remove the empty line.
The result:
RowID  header1  header2  header3
Row0   10       20       30
Row1   20       30       40
This is not nice if one has several files … any advice?
p.s. sorry I do not get the formatting right …

Peter · August 26, 2009, 3:19pm

This is a known bug. I don’t have a good workaround.
You could go to the “Advanced…” settings, and allow short lines, in the “Short Lines” tab. Also check “read column headers”. This probably reads in your data - but has consequences you may not like:

your columns are named “Col0”, “Col1” - and not as stored in the file.
In int or double columns the missing value pattern is set to the column header from your file.
If you have a missing value in your file (like “-” or “x”), reading will fail (because you would need two missing value patterns, which is not possible)
You have an extra first line in the table with missing values (or the headers from your file, in case your data file contains string columns). You could filter that line out - that is easy. (And, no, it is not possible to rename columns to the values in a specific row. Not yet anyway.)
If your file in fact contains short lines (that is lines with not enough data items), it will read them in and generate missing values for those missing items. (This might be acceptable - but normal file reader behavior is to fail on those short lines.)
Not sure if this is really helpful…

Peter

lucatoldo · August 26, 2009, 3:32pm

Dear Peter, thanks for your fast feedback. The workaround I did worked fine, thank you. luca
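
For several files, the manual clean-up discussed in this thread could probably be scripted before the files reach the File Reader. Below is a minimal sketch in Python, assuming that everything before the header line is either a line starting with # or a blank line. The names strip_preamble and clean_file and the folder "data" are hypothetical and not part of KNIME.

```python
from pathlib import Path


def strip_preamble(lines, comment_prefix="#"):
    """Return the lines with leading comment and blank lines removed.

    Only the block before the first real line (assumed to be the header)
    is dropped; anything after it is kept unchanged.
    """
    lines = list(lines)
    start = 0
    while start < len(lines):
        stripped = lines[start].strip()
        if stripped and not stripped.startswith(comment_prefix):
            break  # first non-blank, non-comment line: treat it as the header
        start += 1
    return lines[start:]


def clean_file(src, dst):
    """Write a copy of src to dst without the leading comments and blank lines."""
    with open(src, newline="") as f:
        lines = f.readlines()
    with open(dst, "w", newline="") as f:
        f.writelines(strip_preamble(lines))


# Hypothetical batch usage: write a cleaned copy next to each original.
# for path in Path("data").glob("*.csv"):
#     clean_file(path, path.with_name(path.stem + "_clean.csv"))
```

The cleaned copies then start directly with header1,header2,header3, so the header line is the first line the reader sees. Writing copies instead of overwriting the originals keeps the comments available if they are needed later.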