Replacing umlauts (German special characters) does not always work

Hi,
I am seeing weird behaviour with the “replaceUmlauts” function in the String Manipulation node and in the Column Expressions node.

Goal: In a table of file names, I want to replace the umlauts.

Current situation: This only works for some of the rows with special characters. Often the string content looks the same, but some rows are not converted.
Expression: replaceUmlauts($Dateiname_bereinigt$, false)


The yellow mark shows where it is not working.
The green mark shows where it worked as expected.

I use KNIME version 5.1.1.

Can somebody help me to solve this?

Thanks for your help.

Hello @Patrick_LW

I guess you can try another function in this node: removeDiacritic.

Another thing you can check is the encoding used for the strings you are working with. It is always better to use UTF-8 if possible. This might be the reason the characters are not recognized as characters with umlauts. Just a hypothesis: maybe the characters that were not processed are considered as Swedish characters?
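To check this, it can help to look at the actual code points behind two file names that look identical on screen. The snippet below is a minimal Python sketch, not something taken from KNIME. The helper names and the sample file name are made up for illustration. It assumes (this is not confirmed in this thread) that the failing rows store the umlaut as a base letter plus a combining diaeresis (U+0308) instead of a single precomposed character.

```python
import unicodedata


def describe_code_points(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' entry per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}" for ch in text]


def has_combining_marks(text: str) -> bool:
    """True if text contains at least one combining mark (e.g. U+0308)."""
    return any(unicodedata.combining(ch) for ch in text)


# Hypothetical sample: the same visible name in two Unicode forms.
composed = "M\u00fcller.pdf"     # 'ü' as one precomposed code point
decomposed = "Mu\u0308ller.pdf"  # 'u' followed by a combining diaeresis

for name in (composed, decomposed):
    print(has_combining_marks(name), describe_code_points(name))
```

If the rows that fail report a combining mark and the rows that work do not, the difference is the Unicode normalization form and not the file encoding as such.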

In the worst case you can try using Python functions to remove the diacritics/umlauts.
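A rough sketch of what such Python functions could look like, for example in a Python Script node. The function names and the replacement table are my own assumptions and not a KNIME API. The text is normalized first so that precomposed and decomposed umlauts are handled the same way.

```python
import unicodedata

# Assumed German transliteration table (precomposed characters as keys).
UMLAUT_MAP = {
    "\u00e4": "ae",  # ä
    "\u00f6": "oe",  # ö
    "\u00fc": "ue",  # ü
    "\u00c4": "Ae",  # Ä
    "\u00d6": "Oe",  # Ö
    "\u00dc": "Ue",  # Ü
    "\u00df": "ss",  # ß
}


def replace_umlauts(text: str) -> str:
    """Replace German umlauts with their two-letter spelling."""
    # NFC composes 'u' + U+0308 into a single 'ü' so the table lookup matches.
    composed = unicodedata.normalize("NFC", text)
    return "".join(UMLAUT_MAP.get(ch, ch) for ch in composed)


def remove_diacritics(text: str) -> str:
    """Strip all combining marks, e.g. 'ü' becomes 'u'."""
    # NFD splits each accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Note that remove_diacritics leaves "ß" untouched because it has no decomposition, while replace_umlauts maps it to "ss".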


Hello @Artem
Thanks for your suggestions. The function removeDiacritic is working.
I can continue my workflow.

About your second hint:
The strings are created by the List Files/Folders node, which is connected to a SharePoint Online Connector. It lists all .jpg and .pdf files in a folder. I do not know whether it is possible to check or change the encoding.

If you have any ideas about that, please let me know.

I checked the settings of the List Files/Folders node, and there is no option to specify the character encoding there. So I assume it uses the default encoding configured in KNIME or in your OS.

I do not have the opportunity to check how it works for SharePoint, but perhaps one of the nodes you use to connect to SharePoint has settings related to encoding. Otherwise, the SharePoint setting might be global for all users, in which case only the administrator can change it.
