This topic has been discussed here many times, I know. But the usual solutions do not work in my case (though perhaps everybody says that). Here's my story:
I want to read a lot of CSV files. There is a common set of columns, but some columns are file-specific. I use two loops to iterate over the files.
- The first loop reads only the header line of each CSV file and converts it to an empty table containing these headers. The next item explains why.
- The second loop iterates over the files and reads the complete CSV files. At the end of this loop I have to merge the various columns of all files. The Loop End node expects all data tables to have the same columns, so I concatenate the output of the first loop to add additional (empty) columns.
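To make the intent of the two loops concrete, here is a minimal sketch in plain Python rather than KNIME nodes. It is only an illustration of the idea, not how the nodes work internally. The function names (`read_header`, `union_columns`, `read_all`) and the two sample files are made up for this example, and the files are passed as strings so the sketch is self-contained.

```python
import csv
import io

def read_header(text):
    """First loop: return only the header row of one CSV document."""
    return next(csv.reader(io.StringIO(text)))

def union_columns(texts):
    """Collect every column name across all files, in first-seen order."""
    columns = []
    for text in texts:
        for name in read_header(text):
            if name not in columns:
                columns.append(name)
    return columns

def read_all(texts):
    """Second loop: read each file completely and pad missing columns with None."""
    columns = union_columns(texts)
    rows = []
    for text in texts:
        for record in csv.DictReader(io.StringIO(text)):
            # Columns absent from this file become None (the "empty" columns).
            rows.append([record.get(name) for name in columns])
    return columns, rows

# Hypothetical sample data: shared columns plus one file-specific column each.
file_a = 'id,name,comment\n1,Ann,"first line\nsecond line"\n'
file_b = 'id,name,score\n2,Bob,7\n'

columns, rows = read_all([file_a, file_b])
print(columns)
print(rows)
```

The quoted cell in `file_a` contains a line break on purpose, since that is the case that causes trouble in the second loop.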
I have a problem reading the data in the second loop. This is what I tried:
- The CSV Reader node does adjust to the unknown number of columns in each file, but it doesn't work properly, because my rows include string cells with line breaks.
- The File Reader node can handle the line breaks, but I did not find a way to adjust it to the varying number of columns. (There is no maximum number of columns specified.)
- The Line Reader node can read anything, but the Cell Splitter node doesn't consider the line breaks inside the string cells and treats every line as a separate row.
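The line-break problem can be reproduced outside KNIME. The following sketch, again in plain Python with made-up sample data, contrasts splitting line by line (roughly what the Line Reader plus Cell Splitter combination amounts to) with a parser that respects quoted cells. The helper names `split_by_line` and `parse_quoted` are hypothetical.

```python
import csv
import io

# Hypothetical sample: the second record has a quoted cell with a line break.
text = 'id,comment\n1,"first line\nsecond line"\n2,plain\n'

def split_by_line(data):
    """Naive approach: every physical line becomes a row, split on commas."""
    return [line.split(",") for line in data.splitlines()]

def parse_quoted(data):
    """Quote-aware approach: a quoted cell may span several physical lines."""
    return list(csv.reader(io.StringIO(data)))

naive_rows = split_by_line(text)
parsed_rows = parse_quoted(text)

print(len(naive_rows))   # one row too many: the quoted cell was cut in two
print(len(parsed_rows))  # header plus two records
```

The naive variant produces an extra, malformed row because it cuts the quoted cell at the embedded line break, which matches the behaviour described in the last item above.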
I noticed that the number of columns of the File Reader node can be controlled by a variable. But this isn't enough, because the node complains about missing column configuration data. The line number that determines the column count can also be controlled by a variable, but I couldn't get that to work.
I would really appreciate any hints. Thank you very much in advance.