Row could not be converted error in CSV reader, what alternative do I have?

I am trying to read a few CSV files that have many rows (1.7 million rows with 13 columns). The CSV node scans only part of the data and forces me to use “integer” or “long” for some columns. However, there are rows near the end where the values suddenly become “1.2”, i.e. a double or float. The CSV node does not let me choose that type in the drop-down menu. I looked at another arrangement, File Reader -> Cell Splitter, but I am having trouble separating the single column into multiple columns because the separator is hard to configure. Do you have any other suggestions? Thank you for your thoughts.

My data looks like this (3-column example):
“field 1”, “field 2, may have comma between quotes”, “number field, which may look like int, but has double sometimes”

You can import all columns as Strings and then decide what to do, e.g. replace . with , for certain columns and then convert them into double (or remove the . and convert to integer), depending on your business case.
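To illustrate the "read everything as strings, convert afterwards" idea outside of KNIME, here is a minimal Python sketch. The sample data and the helper names `read_as_strings` and `convert_column` are made up for illustration; only the standard-library `csv` module is used. It assumes fields are quoted with straight double quotes and that a space may follow each comma, as in the example above.

```python
import csv
import io

# Hypothetical sample modelled on the 3-column layout described in the question.
SAMPLE = '''"a", "x, with comma", "1"
"b", "y", "2"
"c", "z, again", "1.2"
'''

def read_as_strings(text):
    """Read every field as a string; no type guessing happens at this stage."""
    # skipinitialspace=True lets the parser recognise a quote that follows ", ".
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [row for row in reader]

def convert_column(rows, index):
    """Convert one column to float once the whole file has been read."""
    return [row[:index] + [float(row[index])] + row[index + 1:] for row in rows]

rows = convert_column(read_as_strings(SAMPLE), 2)
```

Because the conversion runs only after all rows are loaded, a late "1.2" cannot clash with a type guessed from the first rows, and the commas inside quotes stay within their field.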

Then there are several other ways to import ‘messy’ CSV files. You might want to try out the new CSV Reader (Labs) node, or the R package readr, which I have often had success with.

If you want to check out further options about KNIME and CSV I have created a collection for that:

Hi @dataNinja,

If I understand your problem correctly, you can just increase the number of rows that are scanned in the Advanced Settings tab. Let me know if that works for you.

Best,
Simon

Thank you @SimonS. I changed Advanced Settings -> Table specification -> Limit data rows scanned to 10,000. It used to be 50.

This does the trick.
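For anyone wondering why the scan limit matters, here is a toy Python sketch of type guessing from a limited number of leading rows. This is not KNIME's actual implementation; the function `infer_type` and the sample column are hypothetical and only show the general effect.

```python
def infer_type(values, scan_limit=None):
    """Guess 'int' or 'double' from at most scan_limit leading values."""
    scanned = values if scan_limit is None else values[:scan_limit]
    guess = "int"
    for v in scanned:
        try:
            int(v)
        except ValueError:
            float(v)  # raises ValueError if the value is not numeric at all
            guess = "double"
    return guess

# 100 integer-looking values followed by a single decimal value at the end.
column = ["1"] * 100 + ["1.2"]

narrow = infer_type(column, scan_limit=50)      # never sees the "1.2"
wide = infer_type(column, scan_limit=10_000)    # scans the whole column
```

With the small limit the guess is based only on integer-looking values, so the later "1.2" no longer fits the chosen type. Note that a limit of 10,000 can still miss a value that first appears after row 10,000 in a 1.7 million row file.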

Thank you @mlauber71. I am a newbie when it comes to using new custom nodes. Regarding your second link, does this mean I have to find a way to import a new node using this URL? https://hub.knime.com/knime/extensions/org.knime.features.base/latest/org.knime.base.node.io.filehandling.csv.reader.CSVTableReaderNodeFactory

Thanks.

The new CSV reader is no longer called “CSV Reader (Labs)”; in version 4.3 it is just “CSV Reader”. The old one has been deprecated.
