Suggestion: File / CSV-Reader: All to string (/int/dec/etc)

With CSV files that have more than 100 columns and several hundred thousand rows, the type detection from row scanning usually doesn’t work as wished. Especially with numeric-looking fields that hold human-readable identifiers such as GTINs/barcodes, zip codes, article numbers, etc., it’s always a lot of avoidable work to manually set them to string, since they are not to be treated as numbers (leading zeros, for example).

A new button “Assign all to …” would be nice, where … could even be a dropdown with the most common data types, in the Transformation tab of the File/CSV Reader. For daily data wrangling tasks, it would be much easier to set everything to string and convert string to int on the fly where needed.

Addition: In any case, it would make sense to improve the automatic type detection built into the CSV/File Reader so that it sets the type to STRING as soon as the first leading zero is found. Otherwise, the conversion will corrupt the data.
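To make the leading-zero rule concrete, here is a minimal sketch in plain Python. It is not how the File/CSV Reader works internally; the function names `has_leading_zero` and `infer_column_type` and the type labels are made up for illustration. The idea is simply that a single value like `01067` is enough to keep the whole column as string.

```python
import re

# Hypothetical patterns for illustration; a real reader supports more formats.
_INT = re.compile(r"[+-]?\d+")
_DEC = re.compile(r"[+-]?\d+\.\d+")


def has_leading_zero(value: str) -> bool:
    """True for values like '007' or '01067', but not for '0' or '0.5'."""
    digits = value.lstrip("+-")
    return len(digits) > 1 and digits[0] == "0" and digits[1].isdigit()


def infer_column_type(values):
    """Return 'string', 'int' or 'decimal' for one column of raw text values."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "string"
    # One leading zero anywhere in the column forces string,
    # because a numeric cast would silently drop the zero.
    if any(has_leading_zero(v) for v in non_empty):
        return "string"
    if all(_INT.fullmatch(v) for v in non_empty):
        return "int"
    if all(_INT.fullmatch(v) or _DEC.fullmatch(v) for v in non_empty):
        return "decimal"
    return "string"


print(infer_column_type(["10115", "01067", "80331"]))  # zip codes
print(infer_column_type(["1", "20", "300"]))           # plain counts
```

Note that this only helps if the scan actually reaches a row with a leading zero, so for very large files the check would have to run over all rows, not just a sample.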

Best greetings!

@Jrole your suggestion is valid. I tried my hand at a workflow that imports a CSV like this: read all data as one big column, then split the string and recover the column names. Maybe not the most elegant way. The thread also has examples of doing this with R.

Thanks for the answer, but it doesn’t really fit my needs. :frowning:

@Jrole maybe you could try using Python to force the import of all columns as strings.
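A rough sketch of what that could look like, using only Python's standard `csv` module, which returns every cell as text and never guesses types. The helper names `read_csv_as_strings` and `convert_column`, as well as the sample columns `gtin`, `zip` and `qty`, are assumptions for illustration. With pandas, the same idea is usually written as `pd.read_csv(path, dtype=str)`.

```python
import csv
import io


def read_csv_as_strings(source, delimiter=","):
    """Read CSV from a file-like object; every cell stays a str."""
    reader = csv.DictReader(source, delimiter=delimiter)
    return list(reader)


def convert_column(rows, column, caster):
    """Cast one column in place (e.g. caster=int), leaving the rest as text."""
    for row in rows:
        row[column] = caster(row[column])
    return rows


# Made-up sample data; in practice pass open(path, newline="") instead.
sample = io.StringIO(
    "gtin,zip,qty\n"
    "04012345678901,01067,3\n"
    "00012345600012,80331,12\n"
)

rows = read_csv_as_strings(sample)
convert_column(rows, "qty", int)  # only the real number gets converted
print(rows[0])
```

This matches the "everything to string first, convert where needed" approach from the original suggestion: the GTIN and zip code keep their leading zeros, and only `qty` becomes an integer.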