Guys, is there any way to force the reader to treat all columns as strings?
I’m facing a serious issue. I have about 150 CSV files in a folder, and I’ve tried reading them in two ways:
Reading the entire folder at once;
Reading each file through a loop.
All files share the same structure. The problem is that in the CSV Reader, I have to manually change every single column to ‘String’. Even after doing that, if I reopen the node, it might re-interpret them as integers, forcing me to do it all over again.
A key point: my CSV files do not have headers, so I already have the ‘Has Column Header’ option unchecked.
I’ve found several threads with the same issue, where users ask for an option within the node to set a default type for all columns. If I use a loop, the situation gets even worse, because the node re-evaluates the column types in every iteration. When a type changes, the workflow breaks.
My files are tab-separated, for example: 000000 1 00000 @ 00 22. I cannot afford to lose this formatting, such as the leading zeros.
Wouldn’t it be possible to update the CSV or Excel Reader to include an option to lock all columns as String type? These situations are very frustrating. I’ve researched several workarounds on the forums, but they all just make the process more complicated.
For me, the best approach would be reading these files in a loop, but as I mentioned, it keeps breaking my loop structure.
Using the Cell Splitter with the correct delimiter settings on the first row, I obtain the proper number of columns. I then extract this value using the Extract Table Dimension node.
The trick is in the second Cell Splitter node applied to the data section of the table. I use the variable to define the number of columns (blue arrow).
I do this because, in the standard case, the table is scanned again and the data types are assigned automatically, which is exactly what I want to avoid (red arrow).
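For anyone who wants the logic spelled out, here is a rough Python sketch of the same idea outside KNIME: take the column count from the first row only, then split every row into exactly that many string columns, with no type guessing. The function name `split_fixed` and the sample rows are made up for illustration, and this is not what the Cell Splitter does internally.

```python
def split_fixed(lines, delimiter="\t"):
    """Split raw text lines into a fixed number of string columns."""
    # Drop line endings and skip completely empty lines.
    lines = [line.rstrip("\r\n") for line in lines if line.strip("\r\n")]
    if not lines:
        return []

    # The column count comes from the first row only
    # (the role of the Extract Table Dimension step).
    n_cols = len(lines[0].split(delimiter))

    rows = []
    for line in lines:
        # Split into at most n_cols pieces; extra delimiters stay in the last cell.
        parts = line.split(delimiter, n_cols - 1)
        # Pad short rows so every row has the same width.
        parts += [""] * (n_cols - len(parts))
        rows.append(parts)
    return rows


# Hypothetical sample rows modelled on the example in the question.
sample = ["000000\t1\t00000\t@\t00\t22", "000123\t2\t00045\t#\t01\t07"]
table = split_fixed(sample)
```

Every cell stays a string, so values like 000000 keep their leading zeros.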
@Felipereis50 you can take a look at this approach: read everything as fixed width, split the columns by the separator (that part has to work properly), then split the first line off as headers and insert it.
There is also an example in there that uses only Python.
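A Python-only route might look roughly like the sketch below. It is not the example from the linked workflow, just a minimal version that assumes headerless, tab-separated, UTF-8 files. The function name `read_folder_as_strings` and the `*.csv` pattern are hypothetical. The standard-library `csv` reader returns every cell as a string, so nothing gets re-interpreted as an integer.

```python
import csv
from pathlib import Path


def read_folder_as_strings(folder, pattern="*.csv", delimiter="\t"):
    """Read every matching file into one list of rows; all cells stay str."""
    rows = []
    # sorted() gives a stable file order across runs.
    for path in sorted(Path(folder).glob(pattern)):
        with open(path, newline="", encoding="utf-8") as handle:
            # QUOTE_NONE: treat quote characters as ordinary data.
            reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
            for row in reader:
                # Prefix each row with the source file name for traceability.
                rows.append([path.name] + row)
    return rows
```

Calling `read_folder_as_strings("my_folder")` (folder name is a placeholder) returns one list of rows for all files. If pandas is available, `pandas.read_csv(path, sep="\t", header=None, dtype=str)` should give the same all-strings result per file.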