Row filter and cyrillic symbols

Hi, :oops:
Today I tried to use the Row Filter node and, as far as I can see, there is a problem with filtering rows by a string that contains Cyrillic symbols.
I use a text file as the data source and, for example, my target is all rows where the attribute "Country" contains the substring "финл". I know that the sample contains such rows, but the filter can't find them.
P.S. What do you think about a "Bugs" section or something like that in this forum?

Hi Max,

I'm not sure whether this is a problem in the row filter or rather an issue with the file reader.
To find out, could you look at the output data of the node that is connected to the row filter (for example with the interactive table)?
Do you see the Cyrillic symbols in the table? Or is it all cryptic and messed up?

The FileReader (assuming you used it to read in your data) uses the default character set to read files. To be honest, I'm not sure which one that is, and in any case it is most likely different on your system and on mine. So far, there is no support in the file reader for choosing the character set of the file to read.

However, you can set the default character set of the Java VM used by KNIME. I've been playing with this, and I was able to read in a UTF-8 encoded file with Cyrillic symbols, and then the row filter worked fine, by the way.
To specify the default character set, edit the knime.ini file (or .knime.ini for Linux), which is in the same directory as knime.exe, and add the line "-Dfile.encoding=UTF-8".
That will only help you if you can save your data in UTF-8 encoding.
Unfortunately, I had no luck starting KNIME with any of the other documented character sets.
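
To illustrate why the character set matters for a substring filter, here is a minimal Java sketch. It is not KNIME code; the class and method names are made up for the example, and the sample value "Финляндия" is only an assumed value for the "Country" column. It decodes the same UTF-8 bytes once as UTF-8 and once as ISO-8859-1 and checks whether the substring "финл" minus its first letter ("инл") is still found. The Cyrillic text is written with \u escapes so that the example does not depend on the encoding of the source file itself.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Assumed sample value: the Russian word for "Finland", capitalized.
    static final String COUNTRY = "\u0424\u0438\u043d\u043b\u044f\u043d\u0434\u0438\u044f";
    // The letters shared by the sample value and the search string, without the first letter.
    static final String PART = "\u0438\u043d\u043b";

    // Decode raw bytes with the given character set, as a file reader would.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        byte[] utf8Bytes = COUNTRY.getBytes(StandardCharsets.UTF_8);

        String decodedCorrectly = decode(utf8Bytes, StandardCharsets.UTF_8);
        String decodedWrongly = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        // The character set the JVM uses when none is given explicitly.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println(decodedCorrectly.contains(PART)); // true
        System.out.println(decodedWrongly.contains(PART));   // false
    }
}
```

If the data were decoded with the wrong character set, the table would normally also look garbled, which is why checking the interactive table is a useful first step.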

Regards,
- Peter.

Hi Peter,

Quote:
Do you see the Cyrillic symbols in the table? Or is it all cryptic and messed up?

Yes, all the data is displayed correctly in the interactive table (as in the source file).
Quote:
To specify the default character set, edit the knime.ini file (or .knime.ini for Linux), which is in the same directory as knime.exe, and add the line "-Dfile.encoding=UTF-8". That will only help you if you can save your data in UTF-8 encoding.

I saved the file in UTF-8 and added the line to the ini file, but the problem remains.
By the way, I had the same problem with database queries in Delphi when assigning a WideString variable to a parameter of string type.

So, the problem is not the encoding... bummer.

When you say the attribute contains the substring "финл", how do you specify the regular expression in the row filter?
If you want to find all rows where the attribute has a value that contains "финл", then you would specify ".*финл.*" (without the quotes).
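
As a rough illustration of why the ".*" on both sides is needed, here is a small Java sketch. It assumes the row filter behaves like java.util.regex with whole-value matching, which is what the advice above implies but which is not confirmed here; the class and method names are hypothetical, and the sample values are assumed. It also shows that a plain pattern is case-sensitive, so a lowercase "финл" does not match a value that starts with a capital "Ф" unless case-insensitive Unicode matching is switched on. Whether case plays any role in the data discussed here is not known. The Cyrillic strings are written with \u escapes so the source file encoding does not matter.

```java
import java.util.regex.Pattern;

public class RowFilterRegexDemo {
    // "finl" in lowercase Cyrillic, the substring from the thread.
    static final String SUB = "\u0444\u0438\u043d\u043b";
    // Assumed sample values: "Finland" in Russian, lowercase and capitalized.
    static final String LOWER = "\u0444\u0438\u043d\u043b\u044f\u043d\u0434\u0438\u044f";
    static final String CAPITALIZED = "\u0424\u0438\u043d\u043b\u044f\u043d\u0434\u0438\u044f";

    // The whole value must match the pattern, not just a part of it.
    static boolean matchesWholeValue(String regex, String value) {
        return Pattern.compile(regex).matcher(value).matches();
    }

    // Same, but ignoring case; UNICODE_CASE is needed for non-ASCII letters.
    static boolean matchesIgnoringCase(String regex, String value) {
        int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        return Pattern.compile(regex, flags).matcher(value).matches();
    }

    public static void main(String[] args) {
        String contains = ".*" + SUB + ".*";
        System.out.println(matchesWholeValue(SUB, LOWER));              // false
        System.out.println(matchesWholeValue(contains, LOWER));         // true
        System.out.println(matchesWholeValue(contains, CAPITALIZED));   // false
        System.out.println(matchesIgnoringCase(contains, CAPITALIZED)); // true
    }
}
```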

Regards,
- Peter.

Hi Peter,

No change... I tried using .*финл*. and nothing changed.
OK, I will try this on other computers. Maybe it's a problem with my computer...
Then I will write a detailed message.
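
One thing that may be worth checking: the pattern as written in this post, .*финл*. , is not the same as .*финл.* , because the last two characters are swapped. The small Java sketch below compares the two, assuming Java regular expressions with whole-value matching (an assumption about the row filter, not something confirmed in this thread). The class name, method name, and sample value are made up for the example. If the pattern in the post is only a typing mistake in the message, this does not apply.

```java
import java.util.regex.Pattern;

public class QuantifierPlacementDemo {
    // "finl" in lowercase Cyrillic, written with \u escapes.
    static final String SUB = "\u0444\u0438\u043d\u043b";
    // Pattern as suggested: anything, the substring, anything.
    static final String SUGGESTED = ".*" + SUB + ".*";
    // Pattern as written in the post: the "*" now applies to the last letter,
    // and the final "." requires exactly one more character.
    static final String AS_WRITTEN = ".*" + SUB + "*.";
    // Assumed sample value: "Finland" in Russian, all lowercase.
    static final String VALUE = SUB + "\u044f\u043d\u0434\u0438\u044f";

    // The whole value must match the pattern.
    static boolean matchesWholeValue(String regex, String value) {
        return Pattern.matches(regex, value);
    }

    public static void main(String[] args) {
        System.out.println(matchesWholeValue(SUGGESTED, VALUE));  // true
        System.out.println(matchesWholeValue(AS_WRITTEN, VALUE)); // false
    }
}
```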

Best regards,
Max