CSV output encoding type

Hi everyone,

I have a problem with the following string data:
“Khách hàng không yêu cầu hóa đơn”
I tried to export it with the CSV Writer node using every available encoding, but the result is still wrong. Here is the output, along with the options I set in the CSV Writer node.

Thank you for all your help.

image

Hi @huynhduc, if I set the encoding on the CSV Writer to UTF-8, it writes the string as this:

“Khách hàng không yêu cầu hóa đơn”

which I believe is correct.

How are you testing the result? If you are reading it back with a CSV Reader (or an external app), make sure the encoding is also set to UTF-8 on the reader.
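To illustrate why the reader's encoding matters, here is a minimal Python sketch (outside KNIME, purely for illustration). It encodes the sample string as UTF-8 and then decodes the same bytes twice: once as UTF-8 and once as cp1252, which is only an assumed example of a legacy code page a reader might default to. The helper name `decode_as` is hypothetical.

```python
# The sample string from this thread.
TEXT = "Khách hàng không yêu cầu hóa đơn"


def decode_as(raw: bytes, encoding: str) -> str:
    """Decode raw bytes with the given encoding; undecodable bytes become U+FFFD."""
    return raw.decode(encoding, errors="replace")


# These are the bytes a correct UTF-8 writer puts on disk.
utf8_bytes = TEXT.encode("utf-8")

right = decode_as(utf8_bytes, "utf-8")   # reader agrees with the writer
wrong = decode_as(utf8_bytes, "cp1252")  # reader assumes a legacy code page (assumption)

print("decoded as UTF-8 matches the original:", right == TEXT)
print("decoded as cp1252 matches the original:", wrong == TEXT)
```

The bytes are identical in both cases. Only the reader's assumption changes, and that alone is enough to garble the accented characters.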

Yes @takbb

It's supposed to be UTF-8, but I just created data like this with this setup, and it showed that error.

Hi @huynhduc, what I'm getting at is: how do you know the CSV Writer isn't writing it correctly?

Presumably you are reading it back to verify?

Are you sure it is not the reader that is incorrect? How are you reading the file to verify?

Hi @takbb

After I executed the workflow, I opened the CSV file on my PC and found that it was wrong.

OK, same question though… what did you open it with? How do you know the app you opened it with is reading it correctly as UTF-8?

You are only assuming it is not being written correctly. How do you know with certainty that it is being read correctly? You might be right, that it is not being written as UTF-8, but to find the error, you need to first prove where the error is.
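One way to separate "written wrong" from "read wrong" is to look at the raw bytes of the file, without relying on any viewer. The Python sketch below is only an illustration: the file names are hypothetical, and the deliberately wrong file is written as UTF-16 just to have a counter-example. It checks that the bytes decode as strict UTF-8 and that the expected text is present.

```python
import tempfile
from pathlib import Path

TEXT = "Khách hàng không yêu cầu hóa đơn"


def is_valid_utf8(path: Path) -> bool:
    """Return True if the file's raw bytes decode as strict UTF-8."""
    try:
        path.read_bytes().decode("utf-8")  # strict mode raises on invalid sequences
        return True
    except UnicodeDecodeError:
        return False


def contains_text(path: Path, expected: str) -> bool:
    """Return True if the UTF-8 byte sequence of `expected` occurs in the file."""
    return expected.encode("utf-8") in path.read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.csv"  # hypothetical file names, for illustration only
    bad = Path(tmp) / "bad.csv"
    good.write_bytes((TEXT + "\n").encode("utf-8"))
    bad.write_bytes((TEXT + "\n").encode("utf-16"))  # deliberately not UTF-8

    good_ok = is_valid_utf8(good) and contains_text(good, TEXT)
    bad_ok = is_valid_utf8(bad)

print("good.csv is UTF-8 and contains the expected text:", good_ok)
print("bad.csv is UTF-8:", bad_ok)
```

If the file written by KNIME passes a check like this, the writer did its job and the problem lies with whatever application is displaying the file.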

Here is the output file from KNIME. I just opened it directly and found that problem.

Thanks for the additional info. So I would say that the problem is in the reading application (Excel).

Have you looked at the file using any other applications, such as notepad?

When I write the text you gave as UTF-8 with CSV Writer, it IS being written correctly.

If I open it in notepad on my PC, I see the same text I expected:

image

If I open it directly in Excel… it is Excel that gets it wrong

image


If I tell Excel to import the data from an external file, and then go through the import text/csv process, and tell Excel it is UTF-8… then Excel comes to the party…


image
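As a side note for files that must open correctly in Excel by double-click: Excel generally recognizes UTF-8 when the file starts with a byte-order mark (BOM), which is what the Stack Overflow question linked below discusses. The sketch below is a post-processing step in Python, not a KNIME feature, and this thread does not say whether the CSV Writer can emit a BOM itself. The function and file names are hypothetical.

```python
import codecs
import tempfile
from pathlib import Path

TEXT = "Khách hàng không yêu cầu hóa đơn"


def add_utf8_bom(src: Path, dst: Path) -> None:
    """Copy src to dst, prepending a UTF-8 BOM unless one is already present."""
    raw = src.read_bytes()
    if not raw.startswith(codecs.BOM_UTF8):
        raw = codecs.BOM_UTF8 + raw
    dst.write_bytes(raw)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "output.csv"      # stands in for the file written by KNIME
    dst = Path(tmp) / "output_bom.csv"  # copy intended for opening in Excel
    src.write_bytes((TEXT + "\n").encode("utf-8"))

    add_utf8_bom(src, dst)

    starts_with_bom = dst.read_bytes().startswith(codecs.BOM_UTF8)
    # "utf-8-sig" strips the BOM again when reading the file back.
    text_back = dst.read_text(encoding="utf-8-sig")

print("copy starts with a BOM:", starts_with_bom)
print("text is unchanged:", text_back == TEXT + "\n")
```

Some tools that read the file later do not expect a BOM, so it is probably safer to keep the original file and hand only the copy to Excel.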

Additional references:

stackoverflow.com - Is it possible to force excel recognize utf-8 csv files automatically
https://stackoverflow.com/questions/6002256/is-it-possible-to-force-excel-recognize-utf-8-csv-files-automatically

answers.microsoft.com - How to open UTF-8 CSV file in Excel without mis-conversion of characters in Japanese and Chinese language for both Mac and Windows?

Thanks @takbb for clarifying.

I told Excel to import the data the way you did, and it worked.

However, assuming I do not open the CSV file in Excel, is it encoded correctly every time? Or should we re-check it like this after each execution?

Hi @huynhduc, I'm not in a position to guarantee that the encoding will be right all the time, but based on the available evidence I see no reason to believe that the file will not be encoded correctly.

If you have concerns, I think the simplest way to visually check that it has been encoded OK would be to do what I did in my initial reply to you:

Simply read the file back using a CSV Reader, with encoding set to UTF-8 and compare the results.
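The same read-back check can also be scripted outside KNIME. The sketch below uses Python's standard `csv` module and a temporary file as a stand-in for the real output. The column name `note` and the helper `read_rows` are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path

TEXT = "Khách hàng không yêu cầu hóa đơn"


def read_rows(path: Path, encoding: str = "utf-8") -> list:
    """Read all rows of a CSV file using the given encoding."""
    with path.open(newline="", encoding=encoding) as f:
        return list(csv.reader(f))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "check.csv"  # stand-in for the file written by the workflow
    expected = [["note"], [TEXT]]   # hypothetical header plus the sample value

    # Write the rows as UTF-8, then read them back with the same encoding.
    with path.open("w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(expected)

    round_trip_ok = read_rows(path) == expected

print("read-back matches what was written:", round_trip_ok)
```

If the comparison holds with the reader set to UTF-8, the file is encoded as intended, whatever Excel shows when it opens the file directly.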
