Reading multiple ASCII files from a folder

I am trying to read a folder that contains 954 ASCII files. Each file contains 4 columns and 20540 rows of data (a vibration signal sampled at a high frequency). I want to read one file at a time from the folder, work out the average, SD, max, min, variance etc. (Statistics node) of the first column of that file, save the result as a single row, and keep looping over the folder until all 954 files are read and processed.

How would I create a workflow that implements what I described? I tried the following approach:

List Files (read the folder containing all 954 files)

Table Row to Variable Loop Start (loop over each file)

File Reader (read each file)

Loop End

However, I keep getting an error on the File Reader node. I went into the flow variables of the File Reader and selected the URL variable from the dropdown menu next to the filename setting. It just comes up with loads of errors. I have attached photos of my approach and the errors. Any help on solving the issue, or a different approach to what I am trying to do, would be greatly appreciated.

Many Thanks

Liam

Hi Liam,

I'm using almost the same approach right now, without any problems. So the approach is good, but the devil seems to be in the details. First of all, the warning messages look like the paths to the files might be causing the errors. That might sound odd, seeing how they come directly from the List Files node, but I've encountered problems with URLs in KNIME in the past. It seems, for example, that the handling of "special" characters within URLs is inconsistent. If that is the case, you could try moving your files to a location without "special" characters in the path, or convert the URLs to strings before starting the loop.
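To illustrate what "special" characters in a file URL look like outside of KNIME, here is a minimal Python sketch. It is not what KNIME does internally; the function name `file_url_to_path` and the example URL are made up for illustration, and it assumes POSIX-style paths. The point is only that spaces and similar characters are percent-encoded in a URL and have to be decoded to get the path on disk.

```python
from urllib.parse import urlparse, unquote


def file_url_to_path(url: str) -> str:
    """Turn a file: URL into a plain path, decoding %XX escapes.

    Hypothetical helper for illustration; assumes POSIX-style paths.
    """
    parsed = urlparse(url)
    if parsed.scheme != "file":
        raise ValueError(f"not a file URL: {url!r}")
    # A space is stored as %20 in the URL, '#' as %23, and so on.
    return unquote(parsed.path)


# Example with a made-up location:
print(file_url_to_path("file:/data/my%20runs/a%23b.txt"))
```

If the decoded path and the URL differ, the folder or file names contain characters that had to be escaped, which is the situation where moving the files to a simpler location is worth trying.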

It might also be that the URLs work, but that the File Reader's "auto guessing" gets in the way, so replacing it with e.g. a CSV Reader might be something to try. I don't normally use the File Reader, as there are more specialized readers for my use cases, but I think it has some peculiarities that don't always fit the general model in KNIME very well.

Can you post the complete warning and error messages that you get from the File Reader?

Hi Marlin,

Thanks for your reply. I will have a go at changing the file location. Regarding the CSV Reader: my files are ASCII files, will a CSV Reader be able to read them?

Hi thor,

Thanks for your reply. Please see the attachments for the full error message. I had to scroll horizontally to reach the end of the error message, so I took one screenshot of the beginning and one of the end. The middle section is pretty much a repeat of the beginning, just with other file names.

Liam,

I don't know, because I don't know the details of your format. But here's a link to what CSV is. It certainly can be ASCII, but there's more to it than that.

Actually the file format might be important for the search. I don't know if it would help the experts (thor), but it certainly can't hurt to upload a sample file.

Hm, it looks like the File Reader is actually trying to read the file but stumbles over a supposedly numeric column. From the error output it looks like there are some strange characters in the file. It seems it doesn't recognize the line breaks correctly. Can you post an excerpt from one of the files (as an attachment)?
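One way to check for unexpected line breaks or strange characters outside of KNIME is to look at the raw bytes of a file. The following is only a rough Python sketch; `describe_bytes` is a made-up helper name, and it simply counts the different line-ending styles and any bytes outside the ASCII range.

```python
def describe_bytes(data: bytes) -> dict:
    """Count line-ending styles and non-ASCII bytes in raw file content."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,                            # Windows-style \r\n
        "lf_only": data.count(b"\n") - crlf,     # Unix-style \n
        "cr_only": data.count(b"\r") - crlf,     # old Mac-style \r
        "non_ascii": sum(1 for b in data if b > 127),
    }


# Small in-memory example: two CRLF-terminated lines, all ASCII.
print(describe_bytes(b"1 2 3 4\r\n5 6 7 8\r\n"))
```

For a real file you would pass in the result of reading it in binary mode, e.g. `open(path, "rb").read()` with your own path. A file that reports many non-ASCII bytes is probably not one of the data files at all.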

The files are ASCII files; I had to rename this one to .txt to be able to upload it. The file I attached is one of the 954 files in the folder. The File Reader can open a single file with no problem, but when going through the loop it goes crazy.

Also, one of the URL locations from 'List Files' ends in '.DS_Store'. When I tried to open that file, it was just a bunch of symbols.

I think the .DS_Store file is specific to Mac systems. You should not try to read it. Most probably that is what causes the problem. Could you filter it out?

Cheers, gabor

Yes, I think you are right. I used the terminal to delete the .DS_Store file in the folder and it worked!! Thanks :) But now I get an error from the Loop End node saying "ERROR     Loop End                           Execute failed: Encountered duplicate row ID  "Col0" at row number 2". Is it possible to ignore duplicates?

I'm glad to hear that it worked. But your Mac might create a new .DS_Store, so a more robust method would be to filter inside your workflow. The List Files node has an option to filter by file extension while reading, so that might be an option. But you could also filter the table of URLs after reading.
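For comparison, the same idea (skip hidden files such as .DS_Store, keep only one extension, then compute one row of statistics per file) can be sketched in plain Python. This is not a KNIME workflow, just an illustration. It assumes the files are whitespace-separated numbers without a header line and that they end in `.txt`; both are guesses, and the names `first_column_stats` and `summarize_folder` are made up.

```python
import statistics
from pathlib import Path


def first_column_stats(path: Path) -> dict:
    """Statistics of the first column of one whitespace-separated file."""
    values = []
    with path.open() as handle:
        for line in handle:
            fields = line.split()
            if fields:  # skip empty lines
                values.append(float(fields[0]))
    return {
        "file": path.name,
        "mean": statistics.fmean(values),
        "sd": statistics.stdev(values),
        "variance": statistics.variance(values),
        "min": min(values),
        "max": max(values),
    }


def summarize_folder(folder: str, extension: str = ".txt") -> list:
    """One result row per data file; hidden files and other extensions are skipped."""
    rows = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.name.startswith("."):
            continue  # filters out .DS_Store and other hidden entries
        if path.suffix.lower() != extension:
            continue  # assumed extension, adjust to the real one
        rows.append(first_column_stats(path))
    return rows
```

Calling `summarize_folder` on the data folder returns a list with one dictionary per file, which corresponds to the single row per file collected by the Loop End.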

Duplicates in row IDs cannot be ignored, because row IDs have to be unique, but you can activate the "Uniquify row IDs" setting in the Loop End, which probably does what you want.
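To show why the duplicates appear and what making the IDs unique amounts to, here is a small Python sketch. Every loop iteration produces a row with the same ID ("Col0"), so the collected table would contain that ID once per file. The sketch appends the iteration number to each ID; the exact naming used here is an assumption for illustration and not necessarily what the Loop End produces.

```python
def concatenate_with_unique_ids(iterations):
    """Combine per-iteration tables (row ID -> row) into one table.

    Each row ID gets the iteration index appended so that IDs that repeat
    across iterations no longer clash. The "#index" suffix is illustrative.
    """
    combined = {}
    for index, table in enumerate(iterations):
        for row_id, row in table.items():
            combined[f"{row_id}#{index}"] = row
    return combined


# Two iterations that both produce a row called "Col0":
print(concatenate_with_unique_ids([{"Col0": 1.0}, {"Col0": 2.0}]))
```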