standard deviation of a single value

niederle · April 8, 2009, 7:15pm

Hi there,

recently we found out that both nodes - “pivoting” and “group by” - give a standard deviation of 0 if there is only one value available (in its group), or if there are multiple values but only one of them is not a missing value.
Correct me if I’m wrong, but in those cases the standard deviation should be a missing value and not 0!

Ciao,
Antje

aborg · April 9, 2009, 10:20am

Just a side note: I think having NaN there is also an option (as it is something like a division by 0).

niederle · April 9, 2009, 11:21am

Are NaN values supported so far? If yes, of course this would be even better!

thor · April 9, 2009, 2:56pm

Well, I’m not a professional statistician, but my school mathematics knowledge tells me that the standard deviation of just one value is perfectly well defined as zero. It is the square root of the expected value of the squared difference between each possible value and the mean. If only one value is present, the mean is this value, the difference is 0, the expected value is zero and the square root of zero is zero.
Besides, NaN is supported, as it is just a double value.
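
To make that argument concrete, here is a minimal Python sketch of the population formula (dividing by n). The function name `population_std` and the sample numbers are made up for illustration; this is not how the KNIME nodes are implemented.

```python
import math


def population_std(values):
    """Population standard deviation: mean squared deviation, then square root."""
    n = len(values)
    mean = sum(values) / n
    # Divide by n (not n - 1): every value is treated as the whole population.
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance)


# With a single value the mean equals that value, so every deviation is 0.
single_value_std = population_std([4.2])

# NaN is an ordinary floating-point (double) value.
nan_value = float("nan")
```

With these definitions, `single_value_std` is 0.0 and `math.isnan(nan_value)` is true.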

aborg · April 9, 2009, 3:12pm

Well, you are right… if the standard deviation of the population is computed. If the sample standard deviation is computed, the sum of squared differences is divided by (n-1), which in this case is 0. I assumed Antje was expecting the latter behaviour.
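
A small Python sketch of the (n-1) version, just to show where the division by zero comes from. The name `sample_std`, the use of `None` for missing values, and the choice to return NaN are assumptions for illustration only, not the behaviour of the KNIME nodes.

```python
import math


def sample_std(values):
    """Sample standard deviation (divides by n - 1), ignoring missing values.

    Missing values are represented as None here (an assumption for this sketch).
    Returns NaN when fewer than two non-missing values remain, because the
    divisor n - 1 would be 0.
    """
    present = [x for x in values if x is not None]
    n = len(present)
    if n < 2:
        return float("nan")
    mean = sum(present) / n
    variance = sum((x - mean) ** 2 for x in present) / (n - 1)
    return math.sqrt(variance)


one_value = sample_std([4.2])                  # only one value in the group
one_non_missing = sample_std([None, 4.2, None])  # several rows, one non-missing
two_values = sample_std([2.0, 4.0])
```

Both single-value cases come out as NaN rather than 0, while `two_values` is the square root of 2.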

niederle · April 20, 2009, 8:48am

Hi there,

I fully agree - I was expecting the (n-1)-version. And this seems to be how KNIME calculates it: if I apply it to a small data set and compare the result with both versions calculated manually, the value matches the one from the (n-1)-method.
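
This kind of check can be reproduced outside KNIME, for instance with Python's standard `statistics` module, where `pstdev` divides by n and `stdev` divides by n-1. The data set below is invented for illustration and is not the one I used.

```python
import statistics

# Hypothetical small data set: mean is 5, sum of squared deviations is 32.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

population_version = statistics.pstdev(data)  # sqrt(32 / 8)
sample_version = statistics.stdev(data)       # sqrt(32 / 7)

print("n version:    ", population_version)
print("(n-1) version:", sample_version)
```

The (n-1) value is always the larger of the two, so comparing a node's output against both numbers shows which formula it uses.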

Antje