csv writer issue or bug?

csv writer header bug?

I have a loop set up to cut an xlsx into chunks and write them to csv to import into one of our systems. Our system, however, tells us that the column “Forename” does not exist when we try and import. I have done various checks (even copying the same header rows from an earlier import which did work and comparing with an IF statement) and neither I nor my colleague can find a difference in the text. No additional space, nothing. I even added a column rename node to make sure it was definitely “Forename” and we are getting the same error. If we manually type Forename - or even cut it, paste it elsewhere and then cut it back to where it should be (so not changing at all, just moving it around) - it works. My colleague even just tried coping the column header cell and pasting it back in the same cell and that worked.

So the question is: is there something I have overlooked which is throwing off the formatting when writing column headers to csv? Is this a known bug? Or are we perhaps doing something wrong and not even realising? There is something apparently invisible about the raw form of the Knime output that our system does not like.

That could be an Encoding issue.

Can you post a sample of your code? Its hard to understand what the error is and where.

1 Like

How would I go about finding the code?

Chose them one by one on file reading and writing.

Hello @JWebb,

you have seen this behavior with one column (“Forename”) after specific workflow run or? In any case weird and have never stumbled across it. Have you tried CSV Writer (Labs) node?

Code is available on GitHub.

Br,
Ivan

One thing you could do is employ R and write the CSV file. Maybe not the most elegant of all solution …

Would be interesting to see a workflow where this behaviour is reproduced.

I think I’m a bit out of my depth in terms of knowledge here as I don’t know what GitHub or R are!

Ivan: it’s a variation on the chunk loop workflow which you helped me with earlier. I had to add a couple of nodes to remove columns we did not actually need, then create the spreadsheet name as a variable, then the “create file name” node to fill out the entire pathway for writing.

I have attached a screenshot of what it is doing. the column rename was my attempt to ensure that “forename” was written correctly, though it doesn’t seem to help (even though it looks perfectly fine in the output).

1 Like

I do not see what we might be able to learn from this screenshot where all nodes are green. Would it be possible to provide a minimal example that reproduces the error. Or could you provide us with the Log file set to debug mode.

Concerning R - it might come in handy if you equip yourself with it. It is a great addition to KNIME and sometimes gives you the bit extra of flexibility when KNIME does not have a solution (yet).

Hello @JWebb,

agree with @mlauber71, can’t do much from screenshots. Have you tried new CSV Writer (Labs) node? Does this happen only for one CSV file written from a specific Excel or for each Excel file there is one or more CSV files which your system doesn’t accept?

Br,
Ivan

This happens for all csv files created from that loop. I have opened them in notepad to see the difference between one I did manually (a small list which was easier to chunk manually) and one we just tried with this loop. The only difference I see is that one says “Forename” and one says Forename. All values have “” around them in the file which doesn’t work.

Is there a way I can provide the anonymised dataset I just generated (which does not work)?

Is this new node on a newer version? We’re a bit behind and I only have 4.0.2.

Hello @JWebb,

if quotes are problem then configure CSV Writer node not to use quotes :wink:
(It is done on Quotes tab.)

New CSV Writer is available in newer versions but seems you don’t need it. And why not updating to newest version?

Br,
Ivan

1 Like

I need to stress your attention back to encoding. Some times ago, I had same problem with quotes. The issue was with using default encoding instead of UTF-8. In your case could be different one.

2 Likes

Ivan: Unfortunately, version number is not under my control. I’ve only just got 4.0.2 installed and my organisation seems to think that is the most recent version.

In terms of encoding, I have tried using UTF-8 which has also caused the same error. Even a quick F2 Enter in Excel solved it and got our system to recognise it.

Could it be something to do with quotes? The one I performed F2/Enter upon opens in Notepad without quotes, but those which have not been opened/tweaked yet files where I have not yet done this have no quotes. I may try removing them and see if that changes anything!

1 Like

Hello @JWebb,

I see. Have you tried writing data without quotes? Check bottom image where this is done:

CSVQuotes

Br,
Ivan

2 Likes

I’ve changed that to “Never”, but we have been too busy today to test it. We should be able to soon, so fingers crossed :slight_smile:

1 Like

fingers crossed, otherwise sample would be really useful

1 Like

Just wanted to let you all know it was the quote issue which was the problem. My colleague re-ran the loop on some new data once I forced it to never quote and these files work fine.

Thanks for your help again everyone! :slight_smile:

3 Likes

Hello @JWebb,

glad to hear that :slight_smile:

Br,
Ivan

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.