Data Reader Damaging File

Hi,

I am planning to analyze text in Turkish.
When I add the file reader to the workflow, it doesn’t recognize the characters and damages the file.
There are 6 special characters in Turkish. The CSV Reader changes them into question marks (?).

How can I solve this error?

Hi @ecodacioglu and welcome to the KNIME Community.

Some language-specific characters cannot be represented in the default encoding and need an encoding such as UTF-8, so make sure that you are enforcing it.

In the CSV Reader, go to the Encoding tab and enforce the UTF-8 encoding.

Hi Bruno,

Thanks for the answer. The thing is, as soon as I run (open) the reader, it changes the characters, even when I don’t change any settings in the reader. When I force it to UTF-8, it changes some of the question marks into a question mark in a black diamond. Besides, it is damaging the actual CSV file.


Hi @ecodacioglu, can you share some sample data that we can take a look at?

FYI, it’s not damaging the file. What you see in the CSV Reader is just how it’s interpreting the data and displaying it. It’s not modifying the CSV file itself.
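
To illustrate the point outside KNIME, here is a minimal Python sketch (not what the CSV Reader does internally). It writes a small throwaway file, decodes it once with the right encoding and once with a wrong guess, and compares a checksum of the file before and after. The sample text, the temporary file and the choice of Windows-1252 as the "wrong" encoding are all assumptions made for the example.

```python
import hashlib
import os
import tempfile

text = "çğıöşü"  # the six Turkish-specific lowercase letters

# Write the sample as UTF-8 to a throwaway file (a stand-in for the real CSV).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as fh:
    fh.write(text.encode("utf-8"))
    path = fh.name


def sha256_of(p):
    """Checksum of the raw bytes on disk."""
    with open(p, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


before = sha256_of(path)

# Decode the same bytes twice: once correctly, once with a wrong guess.
with open(path, encoding="utf-8") as f:
    as_utf8 = f.read()
with open(path, encoding="cp1252", errors="replace") as f:
    as_cp1252 = f.read()

after = sha256_of(path)
os.remove(path)

print(as_utf8 == text)    # True: the right encoding restores the text
print(as_cp1252 == text)  # False: the wrong encoding only garbles the display
print(before == after)    # True: reading never changed the bytes on disk
```

Only the decoded view differs between the two reads; the bytes in the file stay the same.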

Hi @bruno29a

Thanks for the information. I had checked the Excel and CSV files in the first place: after saving the file, I reopened it to make sure it had been saved correctly (or so I thought).
After your warning, I repeated the process, and the problem is with Excel. I am now using a third-party editor to edit and save the files, and it seems fine.
Thanks again.
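
For anyone who lands here with the same symptoms, this Python sketch shows one plausible way a save step can produce them. It is an assumption, not a confirmed diagnosis of this case: the text is exported in Windows-1252 (cp1252), a code page that has ç, ö and ü but no ğ, ı or ş. The letters that do not fit are written as literal question marks, and forcing UTF-8 afterwards turns the remaining single-byte letters into the replacement character U+FFFD (the question mark in a black diamond).

```python
original = "çğıöşü ÇĞİÖŞÜ"

# Assumption: the export used Windows-1252, which lacks ğ, ı, ş, Ğ, İ and Ş.
# errors="replace" writes a literal "?" for each letter that does not fit.
saved_bytes = original.encode("cp1252", errors="replace")

# Reading the bytes back in the same code page shows the damage is in the data.
as_cp1252 = saved_bytes.decode("cp1252")

# Forcing UTF-8 on those bytes: ç, ö, ü are single bytes that are not valid
# UTF-8 on their own, so each one becomes U+FFFD.
as_utf8 = saved_bytes.decode("utf-8", errors="replace")

print(as_cp1252)                # ç??ö?ü Ç??Ö?Ü
print(as_utf8.count("\ufffd"))  # 6 replacement characters
print(as_utf8.count("?"))       # 6 literal question marks remain
```

In that scenario the question marks are already stored in the file, so no reader setting can bring the original letters back; the file has to be exported again in an encoding that covers them.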

Hi @ecodacioglu, if you enforced UTF-8 and it still shows what is in your screenshot, then your CSV file itself is not properly encoded.
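
A quick way to check this outside KNIME is to try a strict UTF-8 decode of the raw bytes. The sketch below is a rough check under stated assumptions: `looks_like_utf8` is a hypothetical helper name, the sample text is made up, and Windows-1254 (the Turkish Windows code page) stands in for "some legacy encoding". A pure-ASCII file also passes the check, so a positive result does not prove the Turkish letters survived.

```python
def looks_like_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


sample = "Türkçe ığş"
good = sample.encode("utf-8")      # what a correctly saved file would contain
legacy = sample.encode("cp1254")   # same text in the Turkish Windows code page

print(looks_like_utf8(good))               # True
print(looks_like_utf8(legacy))             # False
print(legacy.decode("cp1254") == sample)   # True: readable with the right encoding

# For a real file, read the bytes first, e.g.:
#   with open("your_file.csv", "rb") as f:   # placeholder path
#       print(looks_like_utf8(f.read()))
```

If the check fails, the file is in some other encoding, and the reader needs that encoding selected instead of UTF-8 (or the file needs to be re-saved as UTF-8).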
