Recently we found out that both nodes, “pivoting” and “group by”, give a standard deviation of 0 if there is only one value available in a group, or if there are multiple values but only one of them is not a missing value.
Correct me if I’m wrong, but in those cases the standard deviation should be a missing value and not 0!
Just a side note: I think having NaN there is also an option (as it is something like a division by zero). Are NaN values supported so far? If yes, of course this would be even better!
Well, I’m not a professional statistician, but my school mathematics knowledge tells me that the standard deviation for just one value is perfectly defined as zero. It is the square root of the expected value of the squared difference between all possible values and the mean value. If only one value is present, the mean is this value, the difference is 0, the expected value is zero and the square root of zero is zero.
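Written out, the argument above looks roughly like this. It assumes the population form of the standard deviation (division by n); the symbols x_i, \mu and \sigma are just the usual notation, not anything taken from the nodes themselves.

```latex
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu\right)^2},
\qquad
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i

% With a single value (n = 1) the mean equals that value:
\mu = x_1
\quad\Longrightarrow\quad
\sigma = \sqrt{\frac{(x_1 - x_1)^2}{1}} = \sqrt{0} = 0
```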
Besides, NaN is supported, as it is just a double value.
Well, you are right… if the standard deviation of the population is computed. If the standard deviation is computed from a sample, the sum of squared differences is divided by (n-1), which in this case is 0. I assumed Antje was expecting the latter behaviour.
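To make the difference between the two versions concrete, here is a minimal Python sketch. It is not how the nodes are implemented; the function name `std_dev` and the `ddof` parameter are made up for illustration, and returning NaN for the undefined case is just one of the options discussed in this thread.

```python
import math

def std_dev(values, ddof=1):
    """Standard deviation that skips missing values (None or NaN).

    ddof=0 divides by n (population), ddof=1 divides by n-1 (sample).
    Returns NaN when the divisor would be zero or negative.
    (Hypothetical helper, only for illustration.)
    """
    present = [v for v in values if v is not None and not math.isnan(v)]
    n = len(present)
    if n - ddof <= 0:
        # Sample version with a single value: division by zero, so undefined.
        return float("nan")
    mean = sum(present) / n
    squared_diffs = sum((v - mean) ** 2 for v in present)
    return math.sqrt(squared_diffs / (n - ddof))

single = [5.0, None, float("nan")]   # only one non-missing value
print(std_dev(single, ddof=0))       # population version: 0.0
print(std_dev(single, ddof=1))       # sample version: nan
```

With only one non-missing value, the population version gives 0 and the sample version has nothing sensible to return, which is the case the original question is about.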
I fully agree - I was expecting the (n-1)-version. And this seems to be the way it is calculated in Knime: if I simply apply it to a small data set and compare the result with both versions calculated manually, I get a comparable value from the (n-1)-method.
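For anyone who wants to repeat this kind of check, a small sketch using Python's standard `statistics` module could look like this. The data values are made up; compare both results with what the node reports for the same column.

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up example values

population_sd = statistics.pstdev(data)  # divides by n
sample_sd = statistics.stdev(data)       # divides by n-1

print("population (n):", population_sd)
print("sample (n-1):  ", sample_sd)
```

For this data the two results are about 2.0 and 2.14, so they are far enough apart to tell which version a tool uses. Note that `statistics.stdev` raises a `StatisticsError` when given a single value instead of returning 0.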