Standard Deviation - Group By

Felipereis50 · June 17, 2024, 4:25pm

Hi Friends

I have these values

3690.9
3699.0
3708.42
3745.28
3755.2
3964.27
4178.79

And I’m trying to calculate the standard deviation.
I’m not a statistician, but the value I get differs from what I find on the internet.

I used Excel, and it gives me a value of 170,33.
ChatGPT also gives me 170,00.

But KNIME, in the GroupBy node, gives me 183,97.

Why is it different?

HansS · June 17, 2024, 7:33pm

Hi @Felipereis50

I am not a statistician, but if I calculate the standard deviation step by step, I see where the difference arises. standard_deviation.knwf (26.8 KB)

The last step takes the sum of the squared deviations (each value in column1 minus the mean of column1) and divides it by either the number of observations (n) or the number of observations minus 1 (n-1). The standard deviation is the square root of that result.

dividing by n => st. deviation = 170.329
dividing by n-1 => st. deviation = 183.976
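
For anyone who wants to reproduce the step-by-step calculation outside KNIME, here is a minimal Python sketch. It is not the attached workflow, just the same arithmetic applied to the seven values from the question; the variable names are made up for illustration.

```python
import math

# The seven values from the original question.
values = [3690.9, 3699.0, 3708.42, 3745.28, 3755.2, 3964.27, 4178.79]

n = len(values)
mean = sum(values) / n

# Squared deviation of each value from the mean, then their sum.
squared_deviations = [(x - mean) ** 2 for x in values]
sum_sq = sum(squared_deviations)

# Same sum, two different divisors.
sd_n = math.sqrt(sum_sq / n)                # divide by n
sd_n_minus_1 = math.sqrt(sum_sq / (n - 1))  # divide by n-1

print(f"dividing by n:   {sd_n:.3f}")          # 170.329
print(f"dividing by n-1: {sd_n_minus_1:.3f}")  # 183.976
```

The only difference between the two results is the divisor in the last step.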

Which one is correct, I can’t tell. Read more in this article from www.khanacademy.org.

gr. Hans

yogesh_nawale · June 19, 2024, 9:14am

Hello @Felipereis50,

There are two types of Standard Deviation
1. Population Standard Deviation

The population standard deviation, the standard definition of σ, is used when an entire population can be measured, and is the square root of the variance of a given data set.

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu\right)^2}$$

Where

x_i is an individual value
μ is the mean/expected value
N is the total number of values

2. Sample Standard Deviation

In many cases, it is not possible to sample every member within a population, requiring that the above equation be modified so that the standard deviation can be measured through a random sample of the population being studied.

$$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}$$

Where

x_i is one sample value
x̄ is the sample mean
N is the sample size
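
When both definitions are applied to the same list of numbers, the mean and the sum of squared deviations are identical, so the two results differ only by a fixed factor. A short worked relation, writing σ for the divide-by-N result and s for the divide-by-(N-1) result, and using the N = 7 values from the question:

```latex
s = \sigma \sqrt{\frac{N}{N-1}}
\qquad\Longrightarrow\qquad
s \approx 170.329 \times \sqrt{\frac{7}{6}} \approx 170.329 \times 1.0801 \approx 183.976
```

For small N the gap is noticeable (about 8% here); as N grows, the factor approaches 1 and the two values converge.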

I used the built-in functions in Excel to calculate it both ways:
Sample standard deviation = 183.9762791
Population standard deviation = 170.328937
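
The same check can be done with Python's standard library instead of Excel. This is only a sketch for cross-checking: `statistics.stdev` divides by N-1 (sample) and `statistics.pstdev` divides by N (population).

```python
import statistics

# The seven values from the original question.
values = [3690.9, 3699.0, 3708.42, 3745.28, 3755.2, 3964.27, 4178.79]

sample_sd = statistics.stdev(values)       # divides by N-1
population_sd = statistics.pstdev(values)  # divides by N

print(f"sample standard deviation:     {sample_sd:.4f}")      # 183.9763
print(f"population standard deviation: {population_sd:.4f}")  # 170.3289
```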

KNIME by default uses the sample standard deviation in both the Statistics node and the Math Formula node, which matches the 183,97 you got from the GroupBy node.

Regards
Yogesh

Felipereis50 · June 19, 2024, 2:12pm

@HansS
@yogesh_nawale

Now I understand

Thank you both for the analysis.
It helped me a lot, guys. Good work.
