I’ve tried to do a chi-square test using the Crosstab node. I get a very low p-value (1.74E-21), which seems too good compared to the results I can see in the cross-table. How can this be? Do I need to use a smaller dataset, or does it make sense that the p-value is this low?
I am trying to find out whether there is a correlation between how far in advance a booking was made (long or short lead time) and whether that booking gets cancelled (cancelled = yes/no).
With chi-square you test whether the distribution of the observed frequencies differs from the frequencies expected if the two variables were independent (which is the case in this example). But chi-square doesn’t indicate how strong the association is: with a large dataset, even a small difference gives a very low p-value. For that you need a measure like Phi https://en.wikipedia.org/wiki/Phi_coefficient to see to what extent there is any correlation.
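To make the difference concrete, here is a rough Python sketch for a 2x2 table. The counts are made up for illustration (they are not the numbers from the cross-table in the question), the function name `chi_square_and_phi` is my own, and the chi-square is computed without a continuity correction, so it may not match the node's output exactly.

```python
import math


def chi_square_and_phi(a, b, c, d):
    """Chi-square statistic, p-value and Phi for the 2x2 table [[a, b], [c, d]].

    No continuity (Yates) correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d

    # Phi coefficient: signed strength of association, between -1 and 1.
    phi = (a * d - b * c) / math.sqrt(row1 * row2 * col1 * col2)

    # For a 2x2 table the Pearson chi-square statistic equals n * phi^2.
    chi2 = n * phi ** 2

    # Upper-tail probability of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, phi


# Hypothetical counts. Rows: short / long lead time. Columns: not cancelled / cancelled.
chi2_large, p_large, phi_large = chi_square_and_phi(30000, 10000, 28000, 12000)

# Same proportions, but 1000 times fewer bookings.
chi2_small, p_small, phi_small = chi_square_and_phi(30, 10, 28, 12)

print(f"large sample: chi2={chi2_large:.1f}, p={p_large:.3g}, phi={phi_large:.3f}")
print(f"small sample: chi2={chi2_small:.3f}, p={p_small:.3g}, phi={phi_small:.3f}")
```

In both tables Phi is the same (a weak association), but only the large table gives a tiny p-value. So a very low p-value on a big dataset is plausible even when the cross-table looks almost balanced, and shrinking the dataset would only hide a difference that is really there.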