Problem: PCA node computes all projections regardless of settings

To add to the title: the node works correctly if I change its settings ("Dimensions to reduce to") and rerun it. However, if some upstream nodes change and the whole workflow is rerun, the PCA node computes all projections again. With large datasets this can take an indefinite amount of time. Another thing, which maybe deserves a separate topic, is that it's not possible to cancel the execution of the PCA node. The only way to stop it is to kill the KNIME process.


Which PCA node are you using, the plain "PCA" one? I just tried it and I could cancel it in any state. Okay, honestly, I tried at 3 stages.

I don't fully understand when and why the node was ignoring its settings. Could you explain to me how to set this up?

Best, Iris 

Hi Iris,

Sorry for the late reply, I didn't get any notification from the forum.

Yes, I'm using the plain "PCA" node. I've realized that the node seems to wait for a confirmation of the input data when I start processing another dataset. Once all the nodes upstream of "PCA" are re-executed with new input, the node shows an exclamation mark, as if it were not configured. If I open its configuration dialog and click "OK", the exclamation mark disappears and the following execution goes as expected.

So the node somehow gets reset after upstream nodes change and its settings have to be confirmed. It'd be much more convenient if it just applied previous settings. In my workflow I have "PCA" encapsulated in a couple of metanodes, which makes the situation a bit more annoying.

Regarding cancelling the PCA execution: it usually hangs on my PC if I have a table with a few thousand columns and a few dozen rows. It would be OK for calculating just the first 3 principal components, but if it starts calculating all of them, it takes a while. I can send you such a table if you can't reproduce it on your side.
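
In case it helps with reproducing this, a table of that shape can also be generated synthetically. Below is a minimal sketch in Python with NumPy. It is not my actual data and not how the KNIME node is implemented. The sizes (50 rows, 3000 columns), the seed, and the function names `make_wide_table` and `top_k_pca` are assumptions chosen for illustration. It also shows why computing only the first 3 components should be cheap for such a table: with far fewer rows than columns, a thin SVD of the centered data is enough, and there can be at most (rows - 1) components with non-zero variance anyway.

```python
import numpy as np


def make_wide_table(n_rows=50, n_cols=3000, seed=0):
    """Random table with few rows and many columns (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    return rng.normal(size=(n_rows, n_cols))


def top_k_pca(data, k=3):
    """Project `data` onto its first k principal components via a thin SVD."""
    # Center each column, as PCA requires.
    centered = data - data.mean(axis=0)
    # Thin SVD: vt has shape (min(n_rows, n_cols), n_cols), so the cost is
    # driven by the smaller dimension (the row count here).
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]                      # (k, n_cols) principal axes
    scores = centered @ components.T         # (n_rows, k) projected data
    explained_variance = singular_values[:k] ** 2 / (data.shape[0] - 1)
    return scores, components, explained_variance


table = make_wide_table()
scores, components, explained_variance = top_k_pca(table, k=3)
print(scores.shape, components.shape)
```

The generated table could be written to disk with `np.savetxt("wide_table.csv", table, delimiter=",")` (the file name is just an example) and loaded into the workflow to check whether the node hangs on it.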