The object datatype of a pandas DataFrame makes the Python Source node fail. Workaround: manually cast the object columns to string, but this is inconvenient

Hey all,
after reading about the bug in the thread "Python node failing - No serializer extension having the id or processing python type "int"", I think there is still a similar one with the 'object' datatype of a DataFrame.

Error message after executing the Python Source node:

Execute failed: No serializer extension having the id or processing python type “_frozen_importlib_external.FileFinder” could be found.
Unsupported column type in column: “module_finder”, column type: “<class ‘_frozen_importlib_external.FileFinder’>”

Python script that lists the available Python packages in a pandas DataFrame

The line that overwrites the object datatype with a string type is the workaround that makes the node work; without it, the node fails with the error above. However, this is not very convenient. It would be nice if all datatypes of DataFrames could be parsed into a KNIME table.

import pandas as pd
import pkgutil

# Empty list
data = []

# Loop through all packages and store them in the list
for pkg in pkgutil.iter_modules():
    data.append(pkg)

# Convert the list to a DataFrame
df = pd.DataFrame(data)

# Overwrite the object type with string; only then does it work with KNIME
df = df.astype({'module_finder': pd.StringDtype(), 'name': pd.StringDtype()})

output_table = df

Hi enr0c,

thanks for pointing that out! You are saying that

It would be nice if all datatypes of dataframes could be parsed into a knime table.

And yes, there is currently a mismatch between how pandas and KNIME handle the object dtype.
We will have a look.

For now, you could also convert using the primitive types of Python:
df = df.astype({'module_finder': str, 'name': str})
or, if the bool type of the third column is not important, just
df = df.astype(str)
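
If you want to keep the bool column but do not want to name every object column by hand, something along these lines might work. This is only a rough sketch: the helper name stringify_object_columns is made up for illustration and is not part of KNIME or pandas, and it assumes that a plain str() representation is good enough for every object column.

```python
import pkgutil

import pandas as pd


def stringify_object_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df in which only the object-typed columns are cast to str.

    Hypothetical helper for illustration; not a KNIME or pandas function.
    """
    # Columns that pandas stores with the generic object dtype
    object_columns = df.select_dtypes(include="object").columns
    # Cast only those columns; bool, int, float columns keep their dtype
    return df.astype({column: str for column in object_columns})


# Same table as in the original script
df = pd.DataFrame(list(pkgutil.iter_modules()))

# In the Python Source node, this would be the table handed back to KNIME
output_table = stringify_object_columns(df)
```

The third column stays bool this way, while module_finder ends up as text.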

Best regards
Steffen

Hi enr0c,

KNIME checks whether it knows the data type and, if so, converts the data. However, if we simply tried to convert everything to string, chances are very high that we would lose information without knowing which. I also think it is good for the script writer to know which type they want the data to be. We will not pursue this further for now, sorry.
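
To find out which type the data actually has before deciding on a conversion, a small check like the following could help. It is just a sketch in plain pandas: the function name python_types_per_column is invented for this post, and it only reports what Python sees, not what KNIME supports.

```python
import pkgutil

import pandas as pd


def python_types_per_column(df: pd.DataFrame) -> dict:
    """Map each column name to the sorted names of the Python types of its values.

    Hypothetical helper for illustration; not a KNIME or pandas function.
    """
    report = {}
    for column in df.columns:
        # Collect the distinct Python type names found in this column
        type_names = {type(value).__name__ for value in df[column]}
        report[column] = sorted(type_names)
    return report


# Same table as in the original script
df = pd.DataFrame(list(pkgutil.iter_modules()))

# Print one line per column, e.g. to spot columns holding custom objects
for column, type_names in python_types_per_column(df).items():
    print(column, type_names)
```

For the table in this thread, the module_finder column is the one that holds finder objects such as the FileFinder named in the error message, so that is the column to convert explicitly.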

Best regards
Steffen
