A Post Hoc Analysis for Pearson’s Chi-Squared Test for Count Data

Daniel Ebbert

2019-10-21

Introduction

Pearson's Chi-squared Test for Count Data only tells you whether there is a significant association in the data as a whole. It does not tell you which cells of the table are responsible for it. The following example is taken from the chisq.test documentation.

M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))
dimnames(M) <- list(gender = c("F", "M"),
                    party = c("Democrat","Independent", "Republican"))
chisq.test(M)
#> 
#>  Pearson's Chi-squared test
#> 
#> data:  M
#> X-squared = 30.07, df = 2, p-value = 2.954e-07

Standardized residuals

As a form of post hoc analysis, the standardized residuals can be examined. A rule of thumb is that a standardized residual with an absolute value above two indicates a cell that deviates significantly from its expected count.

chisq.results <- chisq.test(M)
chisq.results$stdres
#>       party
#> gender   Democrat Independent Republican
#>      F  4.5020535   0.6994517 -5.3159455
#>      M -4.5020535  -0.6994517  5.3159455
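
To make the residuals less of a black box, the sketch below recomputes them by hand in base R. It assumes the definition given in the chisq.test documentation, (observed - expected) / sqrt(V), with V the residual cell variance. The variable names (row_tot, col_tot, cell_var, manual_stdres) are illustrative and not part of any package.

```r
# Rebuild the example table so this block runs on its own
M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))
dimnames(M) <- list(gender = c("F", "M"),
                    party = c("Democrat", "Independent", "Republican"))

n <- sum(M)
row_tot <- rowSums(M)
col_tot <- colSums(M)

# Expected counts under independence
expected <- outer(row_tot, col_tot) / n

# Residual cell variance: E * (1 - row proportion) * (1 - column proportion)
cell_var <- expected * outer(1 - row_tot / n, 1 - col_tot / n)

manual_stdres <- (M - expected) / sqrt(cell_var)

# Should agree with the residuals returned by chisq.test()
all.equal(as.vector(manual_stdres), as.vector(chisq.test(M)$stdres))

# Cells flagged by the rule of thumb (absolute value above two)
abs(manual_stdres) > 2
```

With this table, the rule of thumb flags the Democrat and Republican columns for both genders, while the Independent column stays below the threshold.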

Post Hoc Analysis

However, the threshold of two is only a rule of thumb. The standardized residuals can instead be converted to p-values, which is what this package is designed for, as shown in the following example.

library(chisq.posthoc.test)
chisq.posthoc.test(M,
                   method = "bonferroni")
#>   Dimension     Value  Democrat Independent Republican
#> 1         F Residuals  4.502054   0.6994517  -5.315946
#> 2         F  p values  0.000040   1.0000000   0.000001
#> 3         M Residuals -4.502054  -0.6994517   5.315946
#> 4         M  p values  0.000040   1.0000000   0.000001
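
For readers who want to see where such p-values can come from, here is a minimal base R sketch. It assumes each standardized residual is treated as a standard normal deviate, converted to a two-sided p-value, and Bonferroni-adjusted across all six cells. This is not necessarily the package's exact implementation, but under these assumptions it gives values consistent with the output above. The names p_raw and p_adj are illustrative.

```r
# Rebuild the example table so this block runs on its own
M <- as.table(rbind(c(762, 327, 468), c(484, 239, 477)))
dimnames(M) <- list(gender = c("F", "M"),
                    party = c("Democrat", "Independent", "Republican"))

stdres <- chisq.test(M)$stdres

# Two-sided p-value for each residual (assumption: standard normal reference)
p_raw <- 2 * pnorm(-abs(stdres))

# Bonferroni adjustment over all cells; assigning into p_adj[] keeps the
# table's dimensions and dimnames
p_adj <- p_raw
p_adj[] <- p.adjust(as.vector(p_raw), method = "bonferroni")

round(p_adj, 6)
```

Other correction methods accepted by p.adjust, such as "holm" or "BH", can be swapped in the same way.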