How to read a Japanese file (not UTF-8 encoded) in the File Reader node?

Nirmala_Chavali · August 18, 2016, 11:14pm

I am trying to read a file containing Japanese characters in a KNIME workflow using the File Reader node. None of the built-in encodings (UTF-8, UTF-16, etc.) read this file properly. When I opened the file in Excel, it used "932: Japanese (Shift-JIS)" and read the file fine. Is there any way I can use this code page to read the file in my KNIME client? Is there a setting I need to change so that the File Reader node will display this character set?

Is there anything I need to do on the Linux server to make this workflow run successfully on the KNIME server?

jonfuller · August 19, 2016, 9:19am

Hi Nirmala,

I just tested this on my KNIME Analytics Platform 3.2, and there is an option in the advanced settings, under the File Encoding tab, that allows you to enter a 'user defined' file encoding in the text box. I guess that you're using an older version; I'm afraid I can't remember in which version this functionality was added.

Best,

Jon

Nirmala_Chavali · August 19, 2016, 6:43pm

Hi Jon,

Thanks for your reply. I do see the 'user defined' option in the advanced -> character decoding tab. But when I enter '932: Japanese (Shift-JIS)' and click OK, I get an error saying "Character decoding: The entered character set is not supported by this Java VM".

Nirmala_Chavali · August 19, 2016, 6:52pm

When I entered 'Shift-JIS' instead of '932: Japanese (Shift-JIS)', I could see the Japanese characters.

Thanks a lot.
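
For anyone hitting the same error: the text box appears to expect a charset name that the Java runtime can resolve, not the label Excel displays. The following is a minimal sketch for checking candidate names on the machine (client or server) where the workflow runs. The class name `CharsetCheck` and the helper `isUsable` are hypothetical and not part of KNIME. Whether `Shift_JIS` or `windows-31j` is available depends on the installed JVM, so the sketch prints the result rather than assuming it.

```java
import java.nio.charset.Charset;

public class CharsetCheck {
    // Returns true if the running JVM can resolve the given charset name.
    // Charset.isSupported throws for names containing illegal characters
    // (spaces, parentheses), so such names are reported as not usable.
    public static boolean isUsable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Candidate names: the Excel label, then names Java may recognize.
        String[] candidates = {
            "932: Japanese (Shift-JIS)", "Shift-JIS", "Shift_JIS", "windows-31j", "UTF-8"
        };
        for (String name : candidates) {
            System.out.println(name + " -> " + isUsable(name));
        }
    }
}
```

Running this with the same Java installation that the KNIME server uses on Linux should show whether the name entered in the client is also accepted there.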

Nirmala_Chavali · August 19, 2016, 7:03pm

Unfortunately, although the character set is accepted and the characters are read correctly, I get a ‘NullPointerException’ error when I click Apply / OK in the File Reader configuration window.

Any suggestions to fix this?

Nirmala_Chavali · August 19, 2016, 7:42pm

The NullPointerException went away after resetting the nodes. It looks like the File Reader needs to be in the ‘yellow’ state (configured, not yet executed) when I change the character set.