Cell Splitter fails when the delimiter is a Chinese punctuation mark

I want to split a cell that contains a string with a Chinese comma as the delimiter, so I configured the Cell Splitter:


But this configuration fails with an error:

> ERROR Cell Splitter        0:767      Execute failed: Add Delimiter: 
> The delimiter must begin with a plain ASCII character (ascii code < 127)

What can I do about this kind of problem? Thanks.

Hi @Long , based on the error message, if that character’s ASCII code is >= 127, then it looks like KNIME does not accept it as a delimiter in the Cell Splitter.
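To make the limit in the error message concrete, here is a small Python sketch (not KNIME code) that compares the code points of the two commas. It assumes the Chinese comma in question is the fullwidth comma U+FF0C; the helper name `is_plain_ascii` is made up for illustration.

```python
# Illustrative check only; this is not how KNIME implements the rule.
# Assumption: the Chinese comma in the data is the fullwidth comma U+FF0C.
FULLWIDTH_COMMA = "\uFF0C"
ASCII_COMMA = ","


def is_plain_ascii(ch):
    """Return True if the character's code point is below 127,
    mirroring the condition quoted in the error message."""
    return ord(ch) < 127


print(hex(ord(FULLWIDTH_COMMA)), is_plain_ascii(FULLWIDTH_COMMA))  # 0xff0c False
print(hex(ord(ASCII_COMMA)), is_plain_ascii(ASCII_COMMA))          # 0x2c True
```

The fullwidth comma sits far above the 127 limit, which would explain why the node rejects it while the ordinary comma is accepted.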

Is it possible to replace this character by something else? For example, you could use the Line Reader and do a replace.

Could you share some sample data that we can test with?


Thanks, Bruno.
The data looks like “浦东新区1930例,徐汇区998例,闵行区774例,黄浦区632例,嘉定区591例,松江559例,宝山区450例,普陀区349例,虹口区312例,静安区298例,杨浦区255例,青浦区228例,奉贤区149例,崇明区110例,长宁区93例,金山区60例”

Thanks for the sample data @Long .

As per my suggestion, something like this will do:

In the String Manipulation, I changed the Chinese comma to “English” comma:
replace($column1$, ",", ",")

And then I am able to use the Cell Splitter delimited by the “English” comma.
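For readers who want to try the same replace-then-split idea outside KNIME, here is a minimal Python sketch. It assumes the delimiter is the fullwidth comma U+FF0C and uses only the first three entries of the sample data; the function name `split_on_chinese_comma` is made up for illustration and is not part of the attached workflow.

```python
# Sketch of the replace-then-split approach; not taken from the workflow.
# Assumption: the Chinese comma in the data is the fullwidth comma U+FF0C.
FULLWIDTH_COMMA = "\uFF0C"


def split_on_chinese_comma(text):
    """Replace each fullwidth comma with an ASCII comma, then split on it."""
    normalized = text.replace(FULLWIDTH_COMMA, ",")
    return normalized.split(",")


# Build a short sample from the first three entries of the posted data.
sample = FULLWIDTH_COMMA.join(["浦东新区1930例", "徐汇区998例", "闵行区774例"])
parts = split_on_chinese_comma(sample)
print(parts)  # ['浦东新区1930例', '徐汇区998例', '闵行区774例']
```

Note that this would also split on any ASCII commas already present in the text, just as the Cell Splitter would after the replacement.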

Results:

Here’s the workflow: Cell splitter on Chinese punctuation.knwf (8.3 KB)


@Long I experimented with putting Unicode characters in flow variables and using them on UTF-16LE CSV files, which does work. However, the flow variable cannot be used in the Cell Splitter.

I put it up anyway:


Thanks @bruno29a. It works.

Thanks @mlauber71 for providing a new method with Java.

