Bug: Unsupported Operand for KnimePandasExtensionArray

Description

Attempting to calculate the difference between two datetime columns in a pandas DataFrame raises an unsupported operand type error.

The error was originally raised by a library (pm4py) imported into the KNIME Python Script node. A trivial reproduction that does not use the library is shown below.

Note: I have the from_factorized patch on my version of KNIME, which may be causing (or surfacing) this problem.

Trivial Case Demonstrating Problem

Python Timestamp Bug.knwf (22.2 KB)

The trivial case creates a KNIME table with two columns. The first column contains LocalDateTime start timestamps; the second contains LocalDateTime timestamps a random number of hours after the start.

The KNIME Date&Time Difference node is used to confirm that the duration between the two timestamps can be calculated.

A Python Script node containing the following code generates the problem:

import knime.scripting.io as knio

df = knio.input_tables[0].to_pandas()

# Subtracting one KNIME datetime column from another raises the unsupported operand TypeError.
df["Difference"] = df["Start_Timestamp"] - df["End_Timestamp"]

knio.output_tables[0] = knio.Table.from_pandas(df)

The generated error is:

Executing the Python script failed: Traceback (most recent call last):
  File "<string>", line 6, in <module>
  File "D:\mambaforge\envs\knime_py\lib\site-packages\pandas\core\ops\common.py", line 70, in new_method
    return method(self, other)
  File "D:\mambaforge\envs\knime_py\lib\site-packages\pandas\core\arraylike.py", line 108, in __sub__
    return self._arith_method(other, operator.sub)
  File "D:\mambaforge\envs\knime_py\lib\site-packages\pandas\core\series.py", line 5639, in _arith_method
    return base.IndexOpsMixin._arith_method(self, other, op)
  File "D:\mambaforge\envs\knime_py\lib\site-packages\pandas\core\base.py", line 1295, in _arith_method
    result = ops.arithmetic_op(lvalues, rvalues, op)
  File "D:\mambaforge\envs\knime_py\lib\site-packages\pandas\core\ops\array_ops.py", line 216, in arithmetic_op
    res_values = op(left, right)
TypeError: unsupported operand type(s) for -: 'KnimePandasExtensionArray' and 'KnimePandasExtensionArray'

DiaAzul

Hi @DiaAzul,

Thank you so much for the detailed report and simple reproducing example!

We are aware of the missing operators and already have a development ticket for them (internal reference AP-19182). We hope to address this soon, but unfortunately I cannot promise a fix date. Sorry about that.

In your example you could convert the columns from KnimePandasExtensionArray to non-KNIME types (e.g. using pandas.to_datetime), perform the subtraction, and then store the result in the table. But I guess that would be more difficult for your pm4py use case?
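
A minimal sketch of that workaround, runnable outside KNIME. The DataFrame here is a hypothetical stand-in for knio.input_tables[0].to_pandas(): it holds plain Python datetimes in object-dtype columns, because KnimePandasExtensionArray is not available in a standalone script. Whether pandas.to_datetime accepts the KNIME extension array directly is assumed from the suggestion above and is not verified here.

```python
import datetime as dt

import pandas as pd

# Stand-in for knio.input_tables[0].to_pandas() (assumption: the real columns
# would be backed by KnimePandasExtensionArray instead of object dtype).
df = pd.DataFrame(
    {
        "Start_Timestamp": pd.Series(
            [dt.datetime(2023, 4, 3, 8, 0), dt.datetime(2023, 4, 3, 9, 30)],
            dtype=object,
        ),
        "End_Timestamp": pd.Series(
            [dt.datetime(2023, 4, 3, 11, 0), dt.datetime(2023, 4, 3, 10, 0)],
            dtype=object,
        ),
    }
)

# Convert both columns to native pandas datetimes before doing arithmetic.
start = pd.to_datetime(df["Start_Timestamp"])
end = pd.to_datetime(df["End_Timestamp"])

# Native datetime columns support subtraction; the result is a timedelta column.
df["Difference"] = end - start
```

In the Python Script node, the converted frame would then be passed to knio.Table.from_pandas(df) as in the original script. How the resulting timedelta column maps back to a KNIME type has not been checked here.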

Best,
Carsten

Thanks @carstenhaubold

The code is in the pm4py library, and I don’t relish the idea of digging through it all to fix every little problem that comes up. I will wait for your ticket to be implemented.

Many thanks
DiaAzul
