Encoding Column Names with Special Characters in Python

I have column names with non-ASCII characters, e.g. °C or German umlauts, that I cannot fully use in Python nodes. The data loads into the node successfully and I can view the DataFrame with the correct encoding. However, when accessing the variable I get KeyError: ('t1 (\xc2\xb0C)', u'occurred at index Row0')

Code: output_table['h1 (kJ)'] = output_table.apply(lambda x: enthalpy(x['t1 (°C)'], x['h1']), axis=1)

The variable name seems to be encoded as UTF-8 bytes when it is passed to Python. Is it a limitation of Python 2 that it can only use ASCII variable names, or am I missing something? Thank you!

Python 2.7.11 from Anaconda 2, KNIME 3.1.2.
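
A note on the KeyError above: the key shown in the message, 't1 (\xc2\xb0C)', is a UTF-8 byte string, whereas the column label is presumably held as a unicode string, and the two do not match as keys. The following is only a minimal sketch of that mismatch, not a confirmed diagnosis of the KNIME node. The plain dict `row` is a hypothetical stand-in for one DataFrame row, and the unicode label is an assumption.

```python
# Stand-in for one DataFrame row. Assumption: the table delivers its
# column labels as unicode strings.
row = {u"t1 (\u00b0C)": 21.5, u"h1": 40.0}

# What the literal 't1 (°C)' becomes in Python 2 when the source text is
# UTF-8: the degree sign is stored as the two bytes \xc2\xb0.
byte_key = u"t1 (\u00b0C)".encode("utf-8")

# The same label written with an escape: the source stays pure ASCII,
# but the value is a unicode string with a single code point U+00B0.
unicode_key = u"t1 (\u00b0C)"

found_with_bytes = byte_key in row       # byte string does not match
found_with_unicode = unicode_key in row  # unicode string matches

print(repr(byte_key), found_with_bytes)
print(repr(unicode_key), found_with_unicode)
```

If this is what happens in the node, writing the label as u't1 (\u00b0C)' in the script would avoid putting any non-ASCII character in the coding pane at all.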

So you are trying to access the variable through code you've entered in the Python scripting node?

The problem may stem from the coding pane. I've noticed with the R nodes that non-ASCII characters written in code (in the coding pane of the R node) do not appear to be sent to R correctly: you would never be able to, for example, search successfully for a string containing ä, ö, é, è, etc. within the data. On the other hand, data containing such characters that is sent to R and back does not suffer from this problem.

Yes, I am trying to access a variable whose name contains non-ASCII characters through code in the scripting node. (In another example I get Execute failed: 'ascii' codec can't encode character u'\xdf' in position 4: ordinal not in range(128) for a variable with the character ß.)
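
For reference, the second message is the standard error raised when a unicode string containing ß (U+00DF) is encoded with the ASCII codec, which Python 2 does implicitly in calls such as str(name). Here is a small sketch that reproduces it with an explicit encode so that it runs on Python 2 and 3 alike. The name `Straße` is purely hypothetical; the actual variable name is not given above.

```python
# Hypothetical column name with ß at index 4 (not the real name).
name = u"Stra\u00dfe"

# Encoding to ASCII fails because ß has no ASCII representation.
try:
    name.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc

# Encoding explicitly to UTF-8 succeeds; ß becomes the bytes \xc3\x9f.
utf8_bytes = name.encode("utf-8")

print(error)
print(repr(utf8_bytes))
```

So somewhere along the way the name is probably being converted to a byte string with the default ASCII codec rather than with UTF-8.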

What you describe sounds very similar to what I am encountering. Could this be a general issue? Are you using Windows as well?

I've tested the issue on both Mac OS X and Windows, and it is not platform-dependent. Even the text encoding preferences in KNIME do not appear to affect it.

Interesting. I also tried the following statement in the editor, without luck. Where do we go from here?

# -*- coding: utf-8 -*-
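
Since neither the encoding preferences nor a coding declaration helped, one possible way forward is to stop relying on how the literal in the script is encoded and to look the column label up after normalizing both sides to unicode. What follows is an untested sketch under that assumption; `to_text` and `find_label` are hypothetical helper names, not part of KNIME or pandas.

```python
def to_text(label, encoding="utf-8"):
    """Return label as a unicode string, decoding byte strings."""
    if isinstance(label, bytes):  # bytes is str on Python 2
        return label.decode(encoding)
    return label


def find_label(labels, wanted):
    """Return the stored label that equals wanted once both are unicode."""
    wanted = to_text(wanted)
    for label in labels:
        if to_text(label) == wanted:
            return label
    raise KeyError(wanted)


# Stand-in for output_table.columns (assumed to hold unicode labels).
labels = [u"t1 (\u00b0C)", u"h1"]

# The UTF-8 byte string from the KeyError message resolves to the
# stored unicode label.
match = find_label(labels, b"t1 (\xc2\xb0C)")
print(repr(match))
```

In the node this could be used as col = find_label(output_table.columns, 't1 (°C)') before the apply call, with x[col] inside the lambda. Whether that resolves the problem in KNIME 3.1.2 has not been verified here.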
