Error when using Pivoting nodes that execute in parallel

I encountered a problem when designing a workflow that uses a single data source and then spreads into three parallel execution flows. Each of the flows first filters the columns of the data table according to individual rules and then each flow performs a pivoting operation on the remaining data. As long as the three pivoting nodes are executed manually, everything works as expected. When executed in parallel (e.g. by executing all executable workflow nodes), usually at least two of the three pivoting nodes fail, giving the following error:

ERROR     Pivoting     Execute failed: Encountered duplicate row ID <<<value>>>

where <<<value>>> is a value of the grouping column, which is the same for all pivoting operations.

To me this looks like the different nodes mix up temporary data. I tested with KNIME 2.3.0 and 2.3.1, but did not notice any difference.

Hi MSchmid,

I am able to reproduce this problem with a single Pivot node given the following dataset: the group column is a mix of string and int values containing duplicates, which results in an unknown column type (indicated by a '?'). Can you please check if this is the case? Can you please also check whether there are any additional exceptions or errors in the KNIME Console (the log level can be set to DEBUG in the KNIME preferences)? I am happy to help to get this fixed.

Regards, Thomas

My group column is a Date column containing duplicates. There are no additional exceptions or errors displayed, using log level DEBUG. To support the root cause analysis I could offer to send you the workflow with a sample data set.

We can confirm that this is a bug in KNIME when using date/time cells as the group column in the Pivot node; we are currently looking into it. The only workarounds are to execute the Pivot nodes sequentially (manually, or forced for example by connecting their variable ports), or to convert the date/time column into a string column.
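
For readers unfamiliar with the operation, here is a rough Python sketch of what a pivot does with the group column. This is not KNIME code and says nothing about the actual cause of the bug; the function names `pivot` and `date_to_string` and the sample data are made up for illustration. It only shows the invariant behind the error message (each distinct group value must yield exactly one output row ID) and the string-conversion workaround applied before pivoting.

```python
from collections import defaultdict
from datetime import date


def pivot(rows, group_key, pivot_key, value_key):
    """Sum value_key per (group, pivot) pair; one output row per group value.

    Hypothetical helper for illustration only, not the KNIME implementation.
    """
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[group_key]][row[pivot_key]] += row[value_key]

    out = {}
    for group_value, cells in table.items():
        row_id = str(group_value)  # the group value becomes the row ID
        if row_id in out:
            # Two distinct group values rendered to the same ID.
            raise ValueError(f"Encountered duplicate row ID {row_id}")
        out[row_id] = dict(cells)
    return out


def date_to_string(rows, column):
    """Workaround sketch: replace date cells by ISO strings before pivoting."""
    return [{**row, column: row[column].isoformat()} for row in rows]


# Made-up sample data with a date group column containing duplicates.
rows = [
    {"day": date(2011, 1, 1), "kind": "a", "n": 1.0},
    {"day": date(2011, 1, 1), "kind": "b", "n": 2.0},
    {"day": date(2011, 1, 2), "kind": "a", "n": 3.0},
]

result = pivot(date_to_string(rows, "day"), "day", "kind", "n")
```

In this sketch a duplicate row ID can only occur when two different group values render to the same text, for example the integer 1 and the string "1" in a mixed-type column. Whether something comparable happens inside the Pivot node is an assumption, not something confirmed in this thread.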

Hello MSchmid

Could you please send me your sample workflow for root cause analysis?

I need to detect causal relationships in a dataset and then apply them to new observations to determine the root cause of failures.

Thank you in advance,

Mohammad