Hello,

I would like to apply the χ2 test with a contingency table to a die-rolling example. Rolling a die 600 times in a row gave the following counts: 88, 109, 107, 94, 105, 97 for faces 1, 2, 3, 4, 5, 6 respectively. Ideally, the expected count for each face is 600/6 = 100. The number of degrees of freedom is 6 - 1 = 5. We wish to test the hypothesis that the die is not rigged, with a risk α = 0.05. The null hypothesis here is therefore: "The die is balanced".

We can compute the chi-square statistic by hand. Assuming the null hypothesis is true, the χ2 statistic (the sum over the faces of (observed - expected)^2 / expected) is:

(88 - 100)^2/100 + (109 - 100)^2/100 + (107 - 100)^2/100 + (94 - 100)^2/100 + (105 - 100)^2/100 + (97 - 100)^2/100 = 3.44

The χ2 distribution with five degrees of freedom gives the critical value below which the result is considered consistent with the hypothesis at risk α = 0.05: P(χ2 < 11.07) = 0.95. Since 3.44 < 11.07, we cannot reject the null hypothesis: these data do not allow us to conclude that the die is rigged.
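As a quick check of the hand calculation above, here is a minimal sketch that recomputes the statistic and looks up the critical value with scipy.stats.chi2.ppf. The variable names (observed, expected, stat, critical) are only illustrative, and this does not use chi2_contingency.

```python
from scipy.stats import chi2

# Observed counts for faces 1..6 (600 rolls in total)
observed = [88, 109, 107, 94, 105, 97]
# Expected count per face if the die is balanced: 600 / 6
expected = sum(observed) / len(observed)

# Chi-square statistic: sum of (observed - expected)^2 / expected
stat = sum((o - expected) ** 2 / expected for o in observed)

# Degrees of freedom and critical value at risk alpha = 0.05
dof = len(observed) - 1
critical = chi2.ppf(0.95, dof)

print("chi2 =", round(stat, 2))           # 3.44
print("critical =", round(critical, 2))   # 11.07
print("cannot reject H0:", stat < critical)
```

This only reproduces the arithmetic of the manual test, so it gives the value I am trying to recover with pandas.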

Below is my pandas code that tries to reproduce this result with chi2_contingency.

import pandas as pd
from scipy.stats import chi2_contingency

dico = {'face': [1, 2, 3, 4, 5, 6], 'effectifs': [88, 109, 107, 94, 105, 97]}
tab = pd.DataFrame(dico)
print("tab", tab.head(6))

ta = pd.crosstab(tab['face'], tab['effectifs'])
print("ta = ", ta)

test = chi2_contingency(tab)
print("table de chi2_contingency =", test)

This produces:

tab    face  effectifs
0     1         88
1     2        109
2     3        107
3     4         94
4     5        105
5     6         97

ta =  effectifs  88   94   97   105  107  109
face
1            1    0    0    0    0    0
2            0    0    0    0    0    1
3            0    0    0    0    1    0
4            0    1    0    0    0    0
5            0    0    0    1    0    0
6            0    0    1    0    0    0

table de chi2_contingency = (4.86, 0.432, 5)

Here the χ2 statistic is 4.86, the p-value is 0.432, and there are 5 degrees of freedom.

This does not give χ2 = 3.44 as expected. Something is wrong.

Perhaps the table should be presented differently, for example with a column for the deviation or for the expected value.

I don't know. Do you have any ideas?

Regards,

Atapalou