Hi, I tried the new Python (Labs) node and have an issue when writing the table back:
import knime_io as knio
import pyarrow as pa
# This example script simply outputs the node's input table.
print(type(knio.input_tables[0]))
df = knio.input_tables[0].to_pandas()
print(df.head())
df = df[['Cat','CustomerID']]
ar = pa.Table.from_pandas(df)
#does not work
knio.output_tables[0] = knio.write_table(ar)
#does work
#knio.output_tables[0] = knio.write_table(knio.input_tables[0])
In the node dialog both variants work, but when I close the node and execute the workflow, the first one throws an error.
Any ideas? It just tells me an exception occurred in the Python kernel. Thanks
@Daniel_Weikert could you give us more details, for example what the error message says? Even better, could you provide a workflow that demonstrates the error?
Two possibilities would be date and time variables, where there seem to be problems.
Or there might be an issue with the RowID and the index in pandas.
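To illustrate the second point: a minimal sketch of resetting the index, assuming the KNIME RowIDs end up as the pandas index and should not be serialized back as an extra column (the column and RowID names here are made up):

```python
import pandas as pd

# Hypothetical frame whose index carries KNIME-style RowIDs
df = pd.DataFrame({"Cat": ["a", "b"], "CustomerID": [1, 2]},
                  index=["Row0", "Row1"])

# Drop the RowID index so only the data columns remain
df = df.reset_index(drop=True)
print(df.index.tolist())  # [0, 1]
```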
Thanks a lot for your comments @mlauber71 .
I made sure to remove any datetime columns and reset the index in the pandas df.
It works smoothly as long as I have the node open; only when running the workflow does it fail.
Error is
Execute failed: An exception occured while running the Python kernel. See log for details.
but the KNIME log does not show anything. What kind of log is this message pointing to?
Are the new nodes a huge performance boost? Just wondering, because the old node works fine with this “testing 2 line script”.
br
Could you try replacing

ar = pa.Table.from_pandas(df)
knio.output_tables[0] = knio.write_table(ar)

with passing the dataframe to knio.write_table directly, and see if that works? KNIME performs some additional type handling when converting between pandas and PyArrow internally, but when you do the conversion manually, KNIME cannot do that for you. So that could be the culprit.
A minimal workflow using the same data types (but maybe randomized data) as your real workflow would help us to reproduce and understand the problem better.
Thanks a lot @carstenhaubold
That worked. So it is never required to convert the dataframe back, even though it was converted to pandas after reading? The output writer will always accept a pandas dataframe?
br
The dataframe has several types including datetime, float, and category.
But the columns I converted back to PyArrow for the output were only two categorical columns, and with those alone it already failed.
br
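In case anyone hits the same issue with categorical columns: a possible workaround (a sketch, assuming the dictionary encoding is what trips up the output path) is to cast categoricals to plain strings before handing the dataframe to `knio.write_table`:

```python
import pandas as pd

df = pd.DataFrame({"Cat": pd.Categorical(["a", "b"]), "CustomerID": [1, 2]})

# Cast every categorical column to plain strings so no
# dictionary encoding is involved when the table is written
for col in df.select_dtypes(include="category").columns:
    df[col] = df[col].astype(str)

print(df.dtypes["Cat"])  # object
```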