I am trying to use LightGBM in one of KNIME's Python Script nodes. This requires converting the categorical columns to pandas' categorical data type.
df[cat_cols] = df[cat_cols].astype('category')
This works without issue in a standalone Python script, but it fails when the same script runs inside one of KNIME's Python Script nodes.
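For anyone wanting to reproduce the standalone case, here is a minimal sketch of the conversion with a dtype check. The column names and values are made up for illustration and are not from my actual data.

```python
import pandas as pd

# Hypothetical example data; the column names are placeholders.
df = pd.DataFrame({
    "color": ["red", "green", "red"],
    "size": ["S", "M", "S"],
    "price": [1.0, 2.5, 1.2],
})

# Columns that should be treated as categorical.
cat_cols = ["color", "size"]

# Same conversion as above: cast the selected columns to pandas' category dtype.
df[cat_cols] = df[cat_cols].astype("category")

# Inspect the dtypes to confirm that only the selected columns changed.
print(df.dtypes)
```

Outside KNIME this runs cleanly and the selected columns report the category dtype.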
So I tested a simple scenario in the “Python Source” Node:
I can confirm that columns with a categorical data type cannot be converted into KNIME table columns within the Python Script node.
There is a plan to address this. The current workaround is to convert categorical variables to simple types such as strings, integers, or doubles before writing the dataframe to a KNIME table, as shown below.
df['col2'] = df['col2'].astype('str')
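If several columns are categorical, the same workaround can be applied to all of them at once. The following is only a sketch: the helper name categoricals_to_str and the sample dataframe are hypothetical, and it assumes string output is acceptable for every categorical column.

```python
import pandas as pd


def categoricals_to_str(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every categorical column cast to plain strings.

    Hypothetical helper for illustration; it is not part of KNIME or pandas.
    """
    out = df.copy()
    # Find all columns that currently use the pandas category dtype.
    cat_cols = out.select_dtypes(include="category").columns
    for col in cat_cols:
        out[col] = out[col].astype(str)
    return out


# Made-up example data with one categorical column.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": ["a", "b", "a"]})
df["col2"] = df["col2"].astype("category")

# Keep df (with categories) for model training; hand `plain` to the KNIME table.
plain = categoricals_to_str(df)
```

Working on a copy keeps the categorical version available for LightGBM while only the converted copy is written out. If the categorical columns contain missing values, check how they come out of the string cast before relying on the result.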
We will update you here in this thread once this feature is available.
We have now implemented support for categorical data in Python Script (Labs) and in pure-Python nodes. If you're using the latest KNIME nightly, you already have access to those features; otherwise, wait for the KNIME 4.7 release in December.
Note that this does not include support in the non-Labs Python Script KNIME nodes.