How does KNIME implement P^2 percentile and quantile in the GroupBy node?

Hello, I am new to the forum and wondering how KNIME implements the P^2 percentile and quantile aggregations in the GroupBy node.

As far as I have read, the ways of choosing a position in the sorted list of numbers include:

  • the interquartile method

  • the nearest-rank method

  • the linear interpolation between closest ranks method

  • the weighted percentile method

  • and, for Excel users like me, the Microsoft Excel method (?)

I expect slight differences between these, so can someone explain how KNIME implements P^2 percentile and quantile in the GroupBy node? Or is there any documentation that I can refer to?

Many thanks

Welcome to the forum @ashleychendong.

The Description tab of the configuration dialog of the GroupBy node says that percentiles are calculated using the P^2 algorithm.

Is that what you’re looking for?

Thank you for replying elsamuel~!
I looked up that paper as well, but I still fail to understand what the key differences are between quantile and P^2 percentile in the GroupBy node.

Hi @ashleychendong!

In addition to the previous answer from @elsamuel, I would like to add that P^2 is a heuristic, that is, an approximate method that computes the value with a “very small and fixed storage requirement regardless of the number of observations”, as noted in the paper.

On the other hand, the default method of KNIME's Quantile aggregation is deterministic: it computes the value from all the data and therefore requires more computational resources (although the Quantile calculation in KNIME also offers further options that compute the value using heuristics).
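To make the contrast concrete, here is a minimal Python sketch of the published P^2 idea: five markers are kept and nudged as each value arrives, so storage stays fixed. This is an illustration written for this thread, not KNIME's source code; the class name `P2Quantile` and its methods are hypothetical, and the exact median from `statistics.median` stands in for a deterministic calculation over all the data.

```python
import random
import statistics


class P2Quantile:
    """Streaming estimate of the p-quantile with five markers (P^2 sketch)."""

    def __init__(self, p):
        self.p = p
        self.q = []                                    # marker heights
        self.n = [0, 1, 2, 3, 4]                       # actual marker positions
        self.np = [0, 2 * p, 4 * p, 2 + 2 * p, 4]      # desired positions
        self.dn = [0, p / 2, p, (1 + p) / 2, 1]        # growth of desired positions

    def add(self, x):
        q, n = self.q, self.n
        if len(q) < 5:                 # the first five values initialise the markers
            q.append(x)
            q.sort()
            return
        # Find the cell that contains x, stretching the extremes if needed.
        if x < q[0]:
            q[0] = x
            k = 0
        elif x >= q[4]:
            q[4] = x
            k = 3
        else:
            k = max(i for i in range(4) if q[i] <= x)
        for i in range(k + 1, 5):
            n[i] += 1
        for i in range(5):
            self.np[i] += self.dn[i]
        # Move each interior marker by one position if it drifted too far.
        for i in (1, 2, 3):
            d = self.np[i] - n[i]
            if (d >= 1 and n[i + 1] - n[i] > 1) or (d <= -1 and n[i - 1] - n[i] < -1):
                d = 1 if d > 0 else -1
                cand = self._parabolic(i, d)
                if not q[i - 1] < cand < q[i + 1]:
                    cand = self._linear(i, d)      # fall back to linear adjustment
                q[i] = cand
                n[i] += d

    def _parabolic(self, i, d):
        q, n = self.q, self.n
        return q[i] + d / (n[i + 1] - n[i - 1]) * (
            (n[i] - n[i - 1] + d) * (q[i + 1] - q[i]) / (n[i + 1] - n[i])
            + (n[i + 1] - n[i] - d) * (q[i] - q[i - 1]) / (n[i] - n[i - 1])
        )

    def _linear(self, i, d):
        q, n = self.q, self.n
        return q[i] + d * (q[i + d] - q[i]) / (n[i + d] - n[i])

    def value(self):
        if not self.q:
            raise ValueError("no data yet")
        if len(self.q) < 5:            # too few values: read from the sorted buffer
            return self.q[min(len(self.q) - 1, int(self.p * len(self.q)))]
        return self.q[2]               # the middle marker tracks the p-quantile


random.seed(0)
data = [random.random() for _ in range(10000)]
est = P2Quantile(0.5)
for x in data:
    est.add(x)
print("P^2 estimate:", est.value())
print("exact median:", statistics.median(data))
```

The two printed numbers are usually close but not identical, which is the kind of small difference you can expect between an approximate streaming method and an exact one.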

Finally, if you want the KNIME equivalent of Excel's method, I suggest using the Quantile approach, based on the heuristic vs deterministic distinction above and on this short example. (Be careful when copying KNIME result values to Excel, because KNIME's default table views round long values for display.)

Quantile Forum.knwf (14.2 KB)
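Separately from the workflow, here is a small Python sketch of why the methods listed in the question can disagree on the same data. It compares the nearest-rank method with linear interpolation at position p*(n-1), which is the convention of Excel's PERCENTILE.INC. The function names are made up for illustration, and nothing here claims to be what KNIME does internally.

```python
import math


def nearest_rank(values, p):
    """Nearest-rank: the value at 1-based rank ceil(p * n) in the sorted data."""
    s = sorted(values)
    rank = max(1, math.ceil(p * len(s)))
    return s[rank - 1]


def linear_inclusive(values, p):
    """Linear interpolation between the two closest ranks at position p * (n - 1)."""
    s = sorted(values)
    pos = p * (len(s) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


data = [15, 20, 35, 40, 50]
print(nearest_rank(data, 0.3))       # 20, always an actual data value
print(linear_inclusive(data, 0.3))   # about 23, a value between 20 and 35
```

Nearest-rank always returns a value that occurs in the data, while interpolation can return a value in between, so the two can differ even on a five-row table.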

Regards,
