Need help reading a file

Hey folks,

Unfortunately, I am struggling to read a data file into KNIME. Please see the attached file to see what I mean.

I am using the File Reader node and have tried all kinds of option combinations. I am pretty sure that I have overlooked something, so help is really appreciated.

The problem seems to be the very last line of the file and the error message says "Execute failed: Too many data elements (line: 41 (Row23)...". It's funny, because a message like "too few" would have made more sense.

Note:

I know that I have to skip the first 16 lines, that there is an option to ignore short lines, and that I can ignore extra delimiters at the end of a row. But no combination works, at least for me. Tested under Windows and Mac.

Best,
Marc

Hi Marc,

I am afraid that is a bug in the file reader: If the last line has no line feed at the end and contains no value in the last item, it fails.
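To check whether a given file is affected, a small shell test along these lines may help. It is only a sketch: it assumes the delimiter is a comma, and the function name ends_with_bare_delimiter is made up for illustration. If the very last byte of the file is a comma, the last line has neither a value in its last item nor a line feed after it.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the last byte of the
# file is a comma, i.e. the last line ends with an empty item and is not
# followed by a line feed.
ends_with_bare_delimiter() {
    # tail -c 1 prints only the final byte of the file.
    [ "$(tail -c 1 "$1")" = "," ]
}

# Only runs when a file name is passed on the command line.
if [ "$#" -gt 0 ]; then
    if ends_with_bare_delimiter "$1"; then
        echo "$1: last line ends with a delimiter and no line feed"
    else
        echo "$1: looks fine"
    fi
fi
```

Saved as a script (the name is up to you), it can be run with one file name as its argument.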

If you can add a new line (CR, LF, Enter) at the end of the file, that would help.

Alternatively, reading the file with the default settings (i.e. no delimiter, and maybe skipping the first 16 lines) and then using the "Cell Splitter" node (with comma as the delimiter) would also be a workaround.

Sorry for that.

 - Peter.

Another workaround for this special case: limiting the number of data rows to 23.

Frank

Hello Peter and Frank,

Thank you for your help. Adding a line feed helps, as does deleting the last comma. Unfortunately, I have hundreds of these files and they all have different numbers of rows. So I think the workaround suggested by Peter is the only option at the moment.

However, I hope the KNIME team can fix this bug. :-)

Best,
Marc

Brilliant idea, never heard of that tool before. THANKS!

Hi

Or you can use the "Delimited text reader" component from the free Actian extension.

You can easily prepare all your files from the shell.

I am assuming your data files live in one directory and you have GNU parallel installed (if you have not, go install it).
 

find /opt/data/*.txt | parallel 'printf "\n" >> {}'

 

This will append a newline to every .txt file in /opt/data/, including files that already end with one.

 

You might also be able to do it with a shell loop or xargs. GNU parallel is basically a replacement for xargs without all its quirkiness. More info here: http://www.youtube.com/watch?v=OpaiGYxkSuQ
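For anyone without GNU parallel, here is a rough sketch of the plain shell-loop variant. It has not been tried on the actual data files, and the function name ensure_trailing_newline is made up for illustration. Unlike the one-liner, it only appends a newline when the file does not already end with one, so running it twice does no harm.

```shell
#!/bin/sh
# Hypothetical helper: append a newline to every .txt file in the given
# directory whose last byte is not already a newline.
ensure_trailing_newline() {
    dir=$1
    for f in "$dir"/*.txt; do
        # If the glob matched nothing, "$f" is the literal pattern; skip it.
        [ -f "$f" ] || continue
        # Command substitution strips a trailing newline, so the result is
        # non-empty only when the last byte is something else.
        # Empty files give an empty result and are left untouched.
        if [ -n "$(tail -c 1 "$f")" ]; then
            printf '\n' >> "$f"
        fi
    done
}

# Only runs when a directory is passed on the command line.
if [ "$#" -gt 0 ]; then
    ensure_trailing_newline "$1"
fi
```

Saved as a script, it would be called with the data directory as its argument, e.g. /opt/data as in the example above. Try it on a copy of the files first, since it modifies them in place.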