Error when using Pivoting nodes that execute in parallel

MSchmid · February 10, 2011, 4:03pm

I encountered a problem when designing a workflow that uses a single data source and then branches into three parallel execution flows. Each flow first filters the columns of the data table according to its own rules and then performs a pivoting operation on the remaining data. As long as the three pivoting nodes are executed manually, one at a time, everything works as expected. When they are executed in parallel (e.g. by executing all executable workflow nodes), usually at least two of the three pivoting nodes fail with the following error:

ERROR Pivoting Execute failed: Encountered duplicate row ID <<<value>>>

where <<<value>>> is a value of the grouping column, which is the same for all pivoting operations.

To me this looks like the different nodes mix up temporary data. I tested with KNIME 2.3.0 and 2.3.1 but did not notice any difference.

gabriel · February 14, 2011, 3:25pm

Hi MSchmid,

I am able to reproduce this problem with a single Pivot node given the following dataset: the group column is a mix of string and int values containing duplicates, which results in an unknown column type (indicated by a '?'). Can you please check whether this is the case? Can you please also check whether there are any additional exceptions or errors in the KNIME Console (the log level can be set to DEBUG in the KNIME preferences)? I am happy to help get this fixed.

Regards, Thomas

MSchmid · February 17, 2011, 12:51pm

My group column is a Date column containing duplicates. With log level DEBUG, no additional exceptions or errors are displayed. To support the root cause analysis, I could send you the workflow with a sample data set.

gabriel · February 17, 2011, 5:10pm

We can confirm that this is a bug in KNIME when using date/time cells as the group column in the Pivot node; we are currently looking into it. The available workarounds are to execute the Pivot nodes sequentially (manually, or forced for example via the variable ports) or to convert the date/time column into a string column.
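
The second workaround, converting the date/time column to a string before pivoting, can be illustrated outside KNIME with a minimal Python sketch. This is not KNIME code and says nothing about the cause of the bug; the sample rows and the `pivot_counts` helper are hypothetical and only show the idea of grouping on a plain string key instead of a date object.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (group date, pivot value).
# These values are made up and do not come from the original workflow.
rows = [
    (date(2011, 2, 10), "A"),
    (date(2011, 2, 10), "B"),
    (date(2011, 2, 11), "A"),
    (date(2011, 2, 10), "A"),
]


def pivot_counts(rows):
    """Count pivot values per group, keyed by the date's ISO string."""
    table = defaultdict(lambda: defaultdict(int))
    for group_date, pivot_value in rows:
        # Convert the date to a string, e.g. "2011-02-10", before grouping.
        key = group_date.isoformat()
        table[key][pivot_value] += 1
    # Return plain dicts so the result is easy to inspect and compare.
    return {key: dict(counts) for key, counts in table.items()}


result = pivot_counts(rows)
```

Each distinct date becomes exactly one string row key, so duplicates in the group column collapse into a single pivot row.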

mz.nozary · January 14, 2016, 12:39pm

Hello MSchmid,

Could you please send me your sample workflow for root cause analysis?

I need to detect causal relationships in a dataset and then apply them to new observations to determine the root cause of failures.

Thank you in advance,

Mohammad