I have a really strange error that I have not been able to fix so far. I have a Python script which takes a dataframe and returns a pandas dataframe with 3 columns. Each element in every column is itself a list. The script worked totally fine, but now I’m getting a seemingly random error:
I know this kind of error and know what it means, but it makes no sense, because the script is running completely fine when I click on execute in the node.
Furthermore, if I just swap my returned df final_df for the input table df_invoices, the error disappears. That makes no sense, because the script does essentially the same work but just returns the old, unaltered dataframe. And what does this have to do with the error above? Why is there even an error outside of the node but not inside?
Can someone please help me.
EDIT: If I comment everything out and just return a pandas DF with numbers I get the following error:
The error seems to happen on the “way back” from Python to KNIME, that is, outside of your script, when KNIME is converting the pandas data frame into a KNIME table again. This conversion only happens when you actually execute the node, not in the configuration dialogue (because in the dialogue, all data stays inside Python and no actual KNIME table is created).
Errors like the one you experience can happen if KNIME fails to find a proper equivalent KNIME-table representation of an output pandas data frame. That is because the format of a KNIME table is more restrictive than that of a data frame, so not every valid data frame can be converted into a valid KNIME table.
It is hard to tell whether that is also the underlying problem in your case. Could you share your knime.log file or check for yourself if the log contains any Python tracebacks that describe the error in more detail (e.g. point to specific lines in the code that performs the conversion)?
NumPy arrays as data frame elements are (currently) not supported by KNIME because KNIME does not have any built-in n-dimensional array types.
Still, we should definitely produce more helpful error messages in such cases and potentially treat NumPy arrays that contain only a single element or are 1-dimensional separately (the latter should be representable in KNIME in the form of collection cells).
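Until then, one possible workaround is to convert array cells to plain Python lists right before the data frame leaves the script. The following is only a minimal sketch: the helper name arrays_to_lists and the example columns are made up for illustration, and the name output_table is assumed to be the variable your node returns.

```python
import numpy as np
import pandas as pd


def arrays_to_lists(df):
    """Return a copy of df in which every numpy array cell is a plain list.

    Hypothetical helper for illustration, not part of KNIME or pandas.
    """
    def convert(cell):
        # ndarray.tolist() also turns numpy scalars into native Python numbers.
        return cell.tolist() if isinstance(cell, np.ndarray) else cell

    out = df.copy()
    for col in out.columns:
        # Only object columns can hold arrays or lists as cell values.
        if out[col].dtype == object:
            out[col] = out[col].map(convert)
    return out


# Example data frame: column "a" holds numpy arrays, column "b" holds lists.
final_df = pd.DataFrame({
    "a": [np.array([1, 2]), np.array([3])],
    "b": [[1.0], [2.0, 3.0]],
})

# Assumed name of the node's output variable.
output_table = arrays_to_lists(final_df)
```

Note that for arrays with more than one dimension, tolist() produces nested lists, which may still not map onto a KNIME collection cell.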
Thank you very much, that solved my problem. I wondered why this error had never happened before: until now I had always used lists. For this use case I urgently needed the numpy array, so I now just convert it back to a list before returning it. Thanks!
The error message with the “.any() or .all()” was weird because it led me to spend an hour testing a list comprehension with an “and”, which was ultimately completely fine.
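For anyone else who runs into this: numpy raises a message of that kind whenever an array with more than one element is used where a single True/False is expected. The thread does not show where exactly this happens during the conversion, so the snippet below only illustrates how such an error arises in general, not the actual KNIME code path.

```python
import numpy as np

arr = np.array([1, 2, 3])

try:
    # arr == arr is an elementwise comparison and yields an array of booleans.
    # Using that array as an "if" condition forces a single truth value,
    # which numpy refuses to guess for more than one element.
    if arr == arr:
        pass
except ValueError as exc:
    message = str(exc)
    print(message)
```

With a plain list in place of the array, the same comparison yields a single bool, which would explain why lists never triggered the error.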
Great! – Yes, we should probably do a better job of pointing out whether an error happens inside or outside the user-provided script. I am going to open an internal ticket for this and the improvements mentioned in the post above.
@ArminFan for what it is worth: you could still combine KNIME and Python and numpy arrays, just not use them inside KNIME but save them separately and re-use them whenever needed:
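As a rough sketch of that idea (not necessarily what was meant above), the arrays could be written to disk with np.save and only their file paths passed through the KNIME table. The directory, file names, and the column name array_path are assumptions for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical storage location; in a real workflow this would be a folder
# that later nodes can also reach.
storage_dir = tempfile.mkdtemp()

arrays = [np.arange(6).reshape(2, 3), np.ones((2, 2))]

# Save each array to its own .npy file and remember the path.
paths = []
for i, arr in enumerate(arrays):
    path = os.path.join(storage_dir, f"array_{i}.npy")
    np.save(path, arr)
    paths.append(path)

# The table that goes back to KNIME contains only strings.
output_table = pd.DataFrame({"array_path": paths})

# In a later Python node, the arrays can be restored from the paths.
restored = [np.load(p) for p in output_table["array_path"]]
```

This keeps shape and dtype intact, at the cost of managing the files yourself.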