Hi,
I have a table containing about 16M rows. I want to export it to CSV using the CSV Writer node. The node runs and does not report any warning.
The issue is that when I read the file back, I get an error:
"Execute failed: New line in quoted string (or closing quote missing). In line 21662."
Indeed, when I look at the content in nano:
"ZINC56683490","C(=N
1cnnc1-c1ccn[nH]1)c1ccccc1",35,"ZINC33332545",40020.0
I wrote the file twice and got the same error at the same lines. I also tried writing the file without quotes and with \t as the separator, but the problem still occurs at the same position.
Did I miss something, or is the node unable to write a CSV file correctly?
Thank you in advance.
Nicolas
PS: KNIME 3.1.2, Ubuntu 14.04 64-bit
I have found a workaround which is a little dangerous but works in my case: allowing multi-line quoted strings when reading the file.
Hope this will help.
Nicolas
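For anyone wondering what that option does to the data, here is a minimal sketch. It uses Python's csv module as a stand-in (an assumption on my side; this is not KNIME's reader), which accepts line breaks inside quoted fields by default. The record is the one quoted above.

```python
import csv
import io

# The broken record from the first post, with the raw line break inside the quoted field.
raw = '"ZINC56683490","C(=N\n1cnnc1-c1ccn[nH]1)c1ccccc1",35,"ZINC33332545",40020.0\n'

# newline="" keeps the line break untouched so the csv module can see it.
rows = list(csv.reader(io.StringIO(raw, newline="")))

record = rows[0]
smiles = record[1]

print(len(rows), "row,", len(record), "fields")
print(repr(smiles))  # the field still contains the embedded line break
```

The record is read as a single row, but the second field still carries the line break, so the string is not a usable structure until that is dealt with.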
It looks like the CSV node is interpreting a \n inside the chemical notation, i.e. *\n* (where each star can be anything), as a newline.
Traditionally \n means newline, so it is not that strange that it does that.
You could try to work around it by replacing \ with its escaped form \\ before exporting, but then you are opening a can of worms.
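A rough sketch of that escaping idea, in plain Python rather than inside KNIME. The helper names are made up for illustration, and the input string is only a guess at what the compound looked like before export (the record above with a literal backslash-n put back where the line break is).

```python
def escape_backslashes(value: str) -> str:
    """Double every backslash so that a later '\\n' is no longer read as an escape."""
    return value.replace("\\", "\\\\")


def unescape_backslashes(value: str) -> str:
    """Undo escape_backslashes; only valid for strings that went through it."""
    return value.replace("\\\\", "\\")


# Hypothetical original string: backslash followed by 'n', not a real newline.
original = r"C(=N\n1cnnc1-c1ccn[nH]1)c1ccccc1"

escaped = escape_backslashes(original)
restored = unescape_backslashes(escaped)

print(escaped)
print(restored == original)
```

The can of worms is the second function: every consumer of the file has to know that the backslashes were doubled and undo it, otherwise the strings it reads are wrong in a different way.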
Allowing multi-line quoted strings is not a good solution either, because the compound was changed when the \n was replaced with a newline, so a bond and an atom are missing.
Personally, I always stick to SDF for compound data.
You are right, a \n contained in the string could explain this error, and I did not think about it. However, in the example above there is no such character.
The problematic pattern seems to be c or n followed by 1 followed by lowercase c or n, but I did not look for every possibility.
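One way to tell the two explanations apart would be to check the strings before they are written. This is only a sketch: the regular expression for the ring pattern is my reading of the description above, and the sample strings are made up for illustration.

```python
import re

# Literal backslash followed by 'n' (two characters, not a real newline).
BACKSLASH_N = re.compile(r"\\n")
# "c or n, then 1, then lowercase c or n"; case of the first letter left open.
RING_PATTERN = re.compile(r"[CcNn]1[cn]")


def classify(smiles: str) -> dict:
    """Report which of the two suspected patterns occur in a string."""
    return {
        "backslash_n": bool(BACKSLASH_N.search(smiles)),
        "ring_pattern": bool(RING_PATTERN.search(smiles)),
    }


# Hypothetical inputs, not taken from the real table.
samples = [
    r"C(=N\n1cnnc1-c1ccn[nH]1)c1ccccc1",  # contains backslash + n
    "C(=N1cnnc1)c1ccccc1",                # ring pattern only
    "CCO",                                # neither
]

for s in samples:
    print(s, classify(s))
```

If the rows that break on export all have `backslash_n` set, the escape-sequence explanation fits; if some only match the ring pattern, something else is going on.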