I’m trying to find out whether the Kurtosis value calculated in the Statistics and GroupBy Nodes is the standard Kurtosis value, excess Kurtosis (kurtosis - 3) or an alternative form.
The description for the Statistics node gives no guidance on this at all, while the Description tab in the GroupBy node’s configuration window says:
“Calculates the kurtosis per group. Attention: calculation is bias-corrected and at least four values per group are required. If the latter does not hold, a missing cell is returned.”
It’s not clear what type of bias correction is used, whether it is applied to standard or excess kurtosis, or whether “bias-corrected” simply refers to the excess kurtosis calculation itself.
From my testing, I don’t believe the value can be standard kurtosis, because I can get negative values. Standard kurtosis is the fourth central moment divided by the squared variance, so it can never be negative. The output could therefore be excess kurtosis, or it could be calculated by an alternative method.
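To make the candidates concrete, here is a minimal Python sketch of the three forms I have in mind: standard kurtosis, excess kurtosis, and the common bias-corrected sample excess kurtosis (often written G2). The function names are my own, and it is only my assumption that the GroupBy node does something like the third one. I note that this estimator also needs at least four values, which would be consistent with the node description, but that is not proof.

```python
from statistics import fmean


def central_moment(values, k):
    """k-th central moment using the population (divide by n) convention."""
    mean = fmean(values)
    return fmean([(x - mean) ** k for x in values])


def standard_kurtosis(values):
    """Fourth central moment over squared variance; never negative."""
    m2 = central_moment(values, 2)
    m4 = central_moment(values, 4)
    return m4 / (m2 * m2)


def excess_kurtosis(values):
    """Standard kurtosis minus 3, so a normal distribution scores 0."""
    return standard_kurtosis(values) - 3.0


def bias_corrected_excess_kurtosis(values):
    """Common sample estimator G2 of excess kurtosis.

    Assumption: this is only one possible meaning of "bias-corrected";
    it is not confirmed to be what KNIME computes.
    Returns None for fewer than four values, where G2 is undefined.
    """
    n = len(values)
    if n < 4:
        return None
    g2 = excess_kurtosis(values)
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)


sample = [1, 2, 3, 4, 5]
print(standard_kurtosis(sample))               # approx. 1.7
print(excess_kurtosis(sample))                 # approx. -1.3
print(bias_corrected_excess_kurtosis(sample))  # approx. -1.2
```

Comparing the node output for a small column like this against the three values should show which form, if any, it matches.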
Can anybody provide a definitive answer on this?
In general it would be good to capture this level of detail in node descriptions so users fully understand how statistics/values are calculated by a particular node.
As much as I love KNIME, I currently find it difficult to advocate for it as effectively as I would like, because issues like this leave me unable to defend it against accusations of being a black-box solution. I would be happy to contribute to updating node descriptions to help improve this situation.
Many thanks for your help