Introduction
Let X denote the number of successes in n independent Bernoulli trials, so that X ~ Binomial(n, p), and let \(\hat p = x/n\) denote the sample proportion for an observed count x. Inference for a single binomial proportion (p) has drawn considerable research attention, with theoretical, applied, and pedagogic objectives.
This package collates widely used methods for inference about p, together with prominent procedures for comparing their performance. It covers the two major statistical paradigms, classical and Bayesian; the latter in particular supplies a list of tools that broadens this scope.
The objective of the package is to present interval estimation procedures for p in a comprehensive way. Performance measures for these procedures, namely coverage probability, expected length, error, p-confidence and p-bias, are included. An array of Bayesian computations (Bayes factor, empirical Bayes, posterior predictive probability, and posterior probability) with conjugate priors is also available. More importantly, the package complements the numerical summaries with graphical forms suited to presentation and teaching.
Additional functions
hypotestBAF1 | covpGEN | lengthGEN | pCOpBIGEN | empericalBA | errGEN
hypotestBAF1x | PlotcovpGEN | PlotlengthGEN | PlotpCOpBIGEN | empericalBAx |
hypotestBAF2x | covpSIM | lengthSIM | | probPOSx |
hypotestBAF2 | PlotcovpSIM | PlotlengthSIM | | probPOS |
hypotestBAF3x | | PlotexplGEN | | probPREx |
hypotestBAF3 | | PlotexplSIM | | probPRE |
hypotestBAF4x | | | | |
hypotestBAF4 | | | | |
hypotestBAF5x | | | | |
hypotestBAF5 | | | | |
hypotestBAF6x | | | | |
hypotestBAF6 | | | | |
9. Testing of hypothesis and other functions for Bayesian method
- EBA:
Highest posterior density (HPD) and two-tailed intervals are provided for all \(x = 0, 1, 2, \ldots, n\), based on the empirical Bayesian approach for the Beta-Binomial model. Lower and upper support values are needed to obtain the maximum likelihood estimates of the prior parameters from the marginal likelihood.
- probPRE:
Computes posterior predictive probabilities for a required number of future trials (\(m\)), given the observed number of trials (\(n\)) and the parameters of the Beta prior distribution.
- hypotestBAF1:
Computes the Bayes factor under the Beta-Binomial model for testing \(H_0: p = p_0\) vs \(H_A: p \ne p_0\), given the number of trials \(n\), for every number of successes \(x = 0, 1, 2, \ldots, n\).
- hypotestBAF2:
Computes the Bayes factor under the Beta-Binomial model for testing \(H_0: p = p_0\) vs \(H_A: p > p_0\), given the number of trials \(n\), for every number of successes \(x = 0, 1, 2, \ldots, n\).
- hypotestBAF3:
Computes the Bayes factor under the Beta-Binomial model for testing \(H_0: p = p_0\) vs \(H_A: p < p_0\), given the number of trials \(n\), for every number of successes \(x = 0, 1, 2, \ldots, n\).
- hypotestBAF4:
Computes the Bayes factor under the Beta-Binomial model for testing \(H_0: p \le p_0\) vs \(H_A: p > p_0\), given the number of trials \(n\), for every number of successes \(x = 0, 1, 2, \ldots, n\).
- hypotestBAF5:
Computes the Bayes factor under the Beta-Binomial model for testing \(H_0: p \ge p_0\) vs \(H_A: p < p_0\), given the number of trials \(n\), for every number of successes \(x = 0, 1, 2, \ldots, n\).
- hypotestBAF6:
Computes the Bayes factor under the Beta-Binomial model for testing \(H_0: p < p_1\) vs \(H_A: p > p_2\), given the number of trials \(n\), for every number of successes \(x = 0, 1, 2, \ldots, n\).
- probPOS:
Computes the probability of the event \(p < p_0\) (with \(p_0\) specified in \([0, 1]\)) from the posterior distribution of the Beta-Binomial model, given the parameters of the prior Beta distribution, for all \(x = 0, 1, 2, \ldots, n\) (\(n\): number of trials).
10. Assistance for reading papers
We take six key papers and show how this package can help reproduce their results. We also point out further areas in which researchers can gain insight using the package.
Papers, data sets and corresponding package functions
Example | n | x | Paper | Methods in the paper | Package functions | Further scope using the package
1 | 20 | 0 | Newcombe 1998 | Wald, Score (both with and without CC), Exact and LR for CI | ciWDx, ciSCx, ciCWDx, ciCSCx, ciEXx, ciLRx | Methods such as Bayesian, Arcsine, Logit Wald; numerical & graphical comparisons of methods; use of general CC and adj. factor
2 | 29 | 1 | | | |
3 | 148 | 15 | | | |
4 | 263 | 81 | | | |
5 | 10 | 10 | Joseph 2005 | Wald and Exact CI | ciWDx, ciEXx | Bayes factor
6 | 100 | 98 | | | |
7 | 17 | 16 | Zhou 2008 | Wald, Score, Agresti-Coull & modified logit for CI | ciWDx, ciSCx, ciAWDx | Other methods such as Bayesian, Arcsine, Logit transformed methods; use of general CC and adj. factor
8 | 14 | 12 | | | |
9 | 109 | 16 | Wei 2012 | Score, Agresti-Coull, Bayesian (Jeffreys prior) & other two methods | ciSCx, ciAWDx, ciBAx | Other classical methods; numerical & graphical comparisons of methods; use of general CC and adj. factor
10 | 167 | 0 | Tuyl 2008 | Bayesian method with five different beta priors | ciBAx | Other classical methods; numerical & graphical comparisons of methods; use of general CC and adj. factor
11 | 10 | NA | Vos 2005 | p-confidence, p-bias | pCOpBIBA |
Paper 1 (Newcombe 1998):
The paper compares seven methods (Wald, Wald with continuity correction, likelihood ratio, Score (Wilson), Score with continuity correction, Clopper-Pearson, and mid-p) for two-sided confidence intervals for a single proportion. The evaluation criteria considered are average coverage probability (CP), aberrations, zero width intervals, and non-coverage. Four illustrative data sets are also provided. The package proportion offers a more comprehensive way of summarizing results of this kind; for example, the function ciAllx (here with n = 20, x = 0) yields an easily comparable numerical summary, with a graphical counterpart, together with other useful measures such as the existence of aberrations and zero width intervals (ZWI). ArcSine and Wald-t methods are additional inclusions. The package also offers summaries and methods that are not readily available elsewhere: the Exact method in a more general form (ciEXx), continuity corrected intervals (ciCAllx), intervals with added pseudo constants (ciAAllx), and quantile (Q) based and highest posterior density (H) based CIs from the Bayesian conjugate method (ciBAx), with the option of specifying any plausible values for the two parameters of the prior beta distribution.
Numerical Summaries
Asymptotic methods CI using ciAllx(x=0,n=20,alp=0.05)
Method | x | Lower limit | Upper limit | Lower aberration | Upper aberration | ZWI
Wald | 0 | 0.0000000 | 0.0000000 | NO | NO | YES
ArcSine | 0 | 0.0472546 | 0.0472546 | NO | NO | YES
Likelihood | 0 | 0.0000253 | 0.0916153 | NO | NO | NO
Score | 0 | 0.0000000 | 0.1611252 | NO | NO | NO
Logit-Wald | 0 | 0.0000000 | 0.1684335 | NO | NO | NO
Wald-T | 0 | 0.0000000 | 0.2440055 | YES | NO | NO
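The Wald and Score rows above can be checked by hand. The following is a minimal base-R sketch of the two textbook formulas, not the package's own code; the helper names `wald_ci` and `score_ci` are hypothetical, and treating an "aberration" as a raw limit falling outside [0, 1] is an assumption about how the flags are defined.

```r
# Textbook Wald interval with flags; hypothetical helper, not part of the package.
wald_ci <- function(x, n, alp = 0.05) {
  z    <- qnorm(1 - alp / 2)
  phat <- x / n
  hw   <- z * sqrt(phat * (1 - phat) / n)   # half-width
  raw  <- c(phat - hw, phat + hw)           # limits before truncation to [0, 1]
  list(lower = max(raw[1], 0), upper = min(raw[2], 1),
       lower_aberration = raw[1] < 0,       # assumed meaning: raw limit below 0
       upper_aberration = raw[2] > 1,       # assumed meaning: raw limit above 1
       zwi = hw == 0)                       # zero width interval
}

# Textbook Score (Wilson) interval; hypothetical helper.
score_ci <- function(x, n, alp = 0.05) {
  z      <- qnorm(1 - alp / 2)
  phat   <- x / n
  centre <- (x + z^2 / 2) / (n + z^2)
  hw     <- z * sqrt(phat * (1 - phat) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(lower = max(centre - hw, 0), upper = min(centre + hw, 1))
}

w <- wald_ci(0, 20)
s <- score_ci(0, 20)
print(unlist(w))
print(round(s, 7))
```

For x = 0 the Wald interval collapses to a single point, which is what the ZWI flag records, while the Score upper limit reduces to z^2 / (n + z^2).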
Exact method CI using ciEXx(x=0,n=20,alp=0.05,e=c(0.1,0.5,0.95,1))
x | Lower limit | Upper limit | Lower aberration | Upper aberration | ZWI | e
0 | 0 | 0.0669670 | NO | NO | NO | 0.10
0 | 0 | 0.1391083 | NO | NO | NO | 0.50
0 | 0 | 0.1662980 | NO | NO | NO | 0.95
0 | 0 | 0.1684335 | NO | NO | NO | 1.00
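One way to read the constant e is as a weight on the probability of the observed count when the binomial tail is inverted, so that e = 1 gives the Clopper-Pearson interval and e = 0.5 the mid-p interval. The sketch below is written under that assumption. It reproduces the x = 0 upper limits tabulated above, but the package's internal definition is not shown in this document. `exact_ci` is a hypothetical helper and assumes e > alp/2.

```r
# Exact-type interval with weight e on P(X = x); hypothetical helper.
# Assumes e > alp / 2 so that each equation changes sign on (0, 1).
exact_ci <- function(x, n, alp = 0.05, e = 1) {
  # Upper limit solves P(X < x | p) + e * P(X = x | p) = alp / 2
  upper_eq <- function(p) pbinom(x - 1, n, p) + e * dbinom(x, n, p) - alp / 2
  # Lower limit solves P(X > x | p) + e * P(X = x | p) = alp / 2
  lower_eq <- function(p) {
    pbinom(x, n, p, lower.tail = FALSE) + e * dbinom(x, n, p) - alp / 2
  }
  lower <- if (x == 0) 0 else uniroot(lower_eq, c(0, 1), tol = 1e-12)$root
  upper <- if (x == n) 1 else uniroot(upper_eq, c(0, 1), tol = 1e-12)$root
  c(lower = lower, upper = upper)
}

e_vals <- c(0.1, 0.5, 0.95, 1)
upper_limits <- sapply(e_vals, function(e) exact_ci(0, 20, e = e)[["upper"]])
print(round(upper_limits, 7))
```

With x = 0 the upper limit has the closed form 1 - (alp / (2e))^(1/n), which makes the effect of e easy to see: smaller e shortens the interval.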
Bayesian CI using ciBAx() with x=0, n=20, alp=0.05, varying a (2, 1, 0.5, 0.02) and b (2, 1, 0.5, 2)
Prior | x | Lower (quantile) | Upper (quantile) | Lower (HPD) | Upper (HPD)
Assuming Symmetry | 0 | 0.0107100 | 0.2194866 | 0.0023218 | 0.1913698
Flat | 0 | 0.0012049 | 0.1610976 | 0.0000000 | 0.1329459
Jeffreys | 0 | 0.0000242 | 0.1166390 | 0.0000000 | 0.0904764
Near boundary | 0 | 0.0000000 | 0.0089203 | 0.0000000 | 0.0021319
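With a Beta(a, b) prior the posterior is Beta(a + x, b + n - x), so both kinds of Bayesian interval can be sketched in a few lines of base R. The helper `bayes_ci` below is hypothetical. It takes the HPD interval to be the shortest interval with posterior mass 1 - alp, which is valid only for unimodal posteriors, and the package may compute it differently.

```r
# Quantile-based and shortest (HPD-type) intervals from a Beta posterior.
# Hypothetical helper; assumes a unimodal posterior for the HPD part.
bayes_ci <- function(x, n, a, b, alp = 0.05) {
  a_post <- a + x
  b_post <- b + n - x
  quant  <- qbeta(c(alp / 2, 1 - alp / 2), a_post, b_post)
  # Width of the interval that leaves probability t in the lower tail
  width <- function(t) {
    qbeta(t + 1 - alp, a_post, b_post) - qbeta(t, a_post, b_post)
  }
  t_opt <- optimize(width, c(0, alp), tol = 1e-10)$minimum
  hpd   <- qbeta(c(t_opt, t_opt + 1 - alp), a_post, b_post)
  c(q_lower = quant[1], q_upper = quant[2], h_lower = hpd[1], h_upper = hpd[2])
}

flat <- bayes_ci(0, 20, a = 1, b = 1)   # flat prior row of the table
print(round(flat, 7))
```

For the flat prior and x = 0 the posterior density is decreasing, so the shortest interval starts at 0, which is why the HPD lower limit in the table is zero while the quantile lower limit is not.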
Adding Pseudo constant using ciAAllx(x=0,n=20,alp=0.05,h=2)
Method | x | Lower limit | Upper limit | Lower aberration | Upper aberration | ZWI
Adj-Wald | 0 | 0.0000000 | 0.1939085 | YES | NO | NO
Adj-ArcSine | 0 | 0.0085880 | 0.2238858 | NO | NO | NO
Adj-Likelihood | 0 | 0.0143776 | 0.2357444 | NO | NO | NO
Adj-Score | 0 | 0.0231588 | 0.2584880 | NO | NO | NO
Adj-Logit Wald | 0 | 0.0209299 | 0.2788112 | NO | NO | NO
Adj-Wald-T | 0 | 0.0000000 | 0.2231950 | YES | NO | NO
Adding Continuity Correction, c = 1/(2n) & using ciCAllx(x=0,n=20,alp=0.05,c=1/40)
Method | x | Lower limit | Upper limit | Lower aberration | Upper aberration | ZWI
Wald | 0 | 0.0000000 | 0.0250000 | YES | NO | NO
ArcSine | 0 | 0.0584251 | 0.0584251 | NO | NO | YES
Score | 0 | 0.0045747 | 0.2004533 | NO | NO | NO
Logit Wald | 0 | 0.0000000 | 0.1684335 | NO | NO | NO
Wald-T | 0 | 0.0000000 | 0.2690055 | YES | NO | NO
Graphical Summaries
# 17. Paper 1
PlotciAllx(x=0,n=20,alp=0.05)

The corresponding comparison of the sum of the lengths of the CIs can be obtained as below.
# 18. Plot of sum of length of exact method
PlotlengthEX(n=10,alp=0.05,e=c(0.1,0.5,0.95,1),a=1,b=1)

For the other evaluation criteria, the package proportion provides ample scope for comparing competing methods. The following table and plots illustrate this for n = 250 (inspired by n = 263) using the functions \(covpAll(n,alp,a1,b1)\) and \(PlotcovpAll(n,alp,a1,b1)\).
Coverage probability using covpAll()
Method | Mean CP | Min CP | RMSE from nominal | RMSE from mean | RMSE from minimum | tol
Wald | 0.9364816 | 0.0008363 | 0.0634224 | 0.0619649 | 0.9376949 | 91.06
ArcSine | 0.9411163 | 0.0176970 | 0.0604679 | 0.0598118 | 0.9253544 | 95.48
Likelihood | 0.9489788 | 0.0000000 | 0.0160055 | 0.0159729 | 0.9491132 | 96.72
Score | 0.9506135 | 0.8516555 | 0.0056199 | 0.0055863 | 0.0991156 | 98.86
Logit-Wald | 0.9521187 | 0.8859012 | 0.0062506 | 0.0058806 | 0.0664781 | 98.02
Wald-T | 0.9497880 | 0.9237747 | 0.0114848 | 0.0114829 | 0.0284351 | 94.70
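Coverage probability at a fixed p is the total binomial probability of those counts x whose interval contains p. The sketch below computes it exactly for the Wald interval and summarizes it over an equally spaced grid of p values. The grid is an assumption made for illustration, so its summaries need not match covpAll(), which may weight or sample p differently. `wald_coverage` is a hypothetical helper.

```r
# Exact coverage probability of the Wald interval at a given p.
# Hypothetical helper; not the package's covpAll().
wald_coverage <- function(p, n, alp = 0.05) {
  x    <- 0:n
  z    <- qnorm(1 - alp / 2)
  phat <- x / n
  hw   <- z * sqrt(phat * (1 - phat) / n)
  covered <- (phat - hw <= p) & (p <= phat + hw)   # which x cover p
  sum(dbinom(x, n, p)[covered])
}

p_grid <- seq(0.01, 0.99, by = 0.01)               # assumed grid of p values
cp <- sapply(p_grid, wald_coverage, n = 10)
summary_cp <- c(mean = mean(cp), min = min(cp))
print(round(summary_cp, 4))
```

Plotting cp against p_grid shows the familiar sawtooth pattern, with coverage dropping sharply as p approaches 0 or 1.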

For comparison, consider the similar plot for \(n = 10\):
# 20. Paper 1
PlotcovpAll(n=10,alp=0.05,a=1,b=1,t1=0.93,t2=0.97)

Paper 2 (Joseph and Reinhold 2005):
Joseph and Reinhold 2005 is a tutorial article on obtaining CIs by inverting two-tailed tests for a single proportion. It deals mainly with the Wald large-sample and Exact methods for CIs and hypothesis testing when p is near the boundary. Only the Wald interval is given in the paper; a comparison of procedures would enhance the presentation and its purpose. One way to do this is through a pictorial output, which can be improved further by sorting the CIs for each \(x = 0, 1, \ldots, n\) using the function \(PlotciAllg(n,alp)\).
# 21. Paper 2 - display the function
PlotciAllg(n=10,alp=0.05)

Further, one of the significant features of the package is its readily available Bayesian testing alternatives for a single binomial proportion (p). For example, the data from this paper involve the classical test \(H_0: p \le 0.9\) vs. \(H_1: p > 0.9\); the Bayes factor can be calculated using the function \(hypotestBAF4x(x,n,th0,a0,b0,a1,b1)\). (Six functions are available, covering the exhaustive possibilities for testing hypotheses on p.) With the uniform prior for the null model and the Jeffreys prior for the alternative model, the numerical result for these data is 0.0832, which is evidence against \(H_0\).
Additionally, the package has functions (such as hypotestBAF4) that give the Bayes factor for all possible values of x, as listed below, so that the change in the Bayes factor, and in turn in the decision, can be seen across x.
# 20. Paper 2 - display the function
hypotestBAF4(n=10, th0=0.9, a0=1,b0=1,a1=0.5,b1=0.5)
Hypothesis test, H0: p <= 0.9 vs. H1: p > 0.9
x | Bayes factor | Interpretation
0 | 3.987416e+10 | Evidence against H1 is very strong
1 | 2.088687e+08 | Evidence against H1 is very strong
2 | 3.619818e+06 | Evidence against H1 is very strong
3 | 1.165063e+05 | Evidence against H1 is very strong
4 | 5.924426e+03 | Evidence against H1 is very strong
5 | 4.440244e+02 | Evidence against H1 is very strong
6 | 4.749817e+01 | Evidence against H1 is strong
7 | 7.123142e+00 | Evidence against H1 is positive
8 | 1.449043e+00 | Evidence against H1 is not worth more than a bare mention
9 | 3.619376e-01 | Evidence against H0 is not worth more than a bare mention
10 | 8.319770e-02 | Evidence against H0 is positive
Paper 3 (Zhou et al 2008) and 4 (Wei Yu et al 2012):
The main objective of Zhou et al 2008 is to improve the logit Wald method, which is illustrated with \(x = 16\) and \(n = 17\). Similarly, Wei Yu et al 2012 propose an improvement to the Score method, with a real data example (\(x = 16, n = 109\)). These two adjustment methods can easily be compared with other adjustment methods using the function ciAAllx from the package. The adjustment factor (h) is intentionally set to zero first, so that the output can be compared with the original results of the respective studies. Such comparisons are common in statistical investigations involving a parameter, particularly p.
# 20. Function to evaluate ci varying the adding constant h
ciAAllx(x=16, n=17,alp = 0.05,h=0)
ciAAllx(x=16, n=109,alp = 0.05,h=0)
The full results are shown below with \(h\) values of 0, 1 and 2.
CI with x=16, n=17 & h=0
Method | x | Lower limit | Upper limit | Lower aberration | Upper aberration | ZWI
Adj-Wald | 16 | 0.8293268 | 1.0000000 | NO | YES | NO
Adj-ArcSine | 16 | 0.7845775 | 0.9999467 | NO | NO | NO
Adj-Likelihood | 16 | 0.7656344 | 0.9965448 | NO | NO | NO
Adj-Score | 16 | 0.7301797 | 0.9895396 | NO | NO | NO
Adj-Logit Wald | 16 | 0.6796805 | 0.9917795 | NO | NO | NO
Adj-Wald-T | 16 | 0.7478459 | 1.0000000 | NO | YES | NO
CI with x=16, n=109 & h=0
Method | x | Lower limit | Upper limit | Lower aberration | Upper aberration | ZWI
Adj-Wald | 16 | 0.0803520 | 0.2132260 | NO | NO | NO
Adj-ArcSine | 16 | 0.0869474 | 0.2190422 | NO | NO | NO
Adj-Likelihood | 16 | 0.0888655 | 0.2211349 | NO | NO | NO
Adj-Score | 16 | 0.0924191 | 0.2252076 | NO | NO | NO
Adj-Logit Wald | 16 | 0.0919145 | 0.2262616 | NO | NO | NO
Adj-Wald-T | 16 | 0.0788708 | 0.2147072 | NO | NO | NO
CI with x=16, n=17 & h=1
Method | x | Lower limit | Upper limit | Lower aberration | Upper aberration | ZWI
Adj-Wald | 16 | 0.7567438 | 1.0000000 | NO | YES | NO
Adj-ArcSine | 16 | 0.7221104 | 0.9888902 | NO | NO | NO
Adj-Likelihood | 16 | 0.7090774 | 0.9816599 | NO | NO | NO
Adj-Score | 16 | 0.6860592 | 0.9706414 | NO | NO | NO
Adj-Logit Wald | 16 | 0.6626006 | 0.9735380 | NO | NO | NO
Adj-Wald-T | 16 | 0.7243814 | 1.0000000 | NO | YES | NO
CI with x=16, n=109 & h=1
Method | x | Lower limit | Upper limit | Lower aberration | Upper aberration | ZWI
Adj-Wald | 16 | 0.0861567 | 0.2201496 | NO | NO | NO
Adj-ArcSine | 16 | 0.0925269 | 0.2257484 | NO | NO | NO
Adj-Likelihood | 16 | 0.0943876 | 0.2277104 | NO | NO | NO
Adj-Score | 16 | 0.0978748 | 0.2316357 | NO | NO | NO
Adj-Logit Wald | 16 | 0.0973834 | 0.2326298 | NO | NO | NO
Adj-Wald-T | 16 | 0.0847926 | 0.2215137 | NO | NO | NO
CI with x=16, n=17 & h=2
Method | x | Lower limit | Upper limit | Lower aberration | Upper aberration | ZWI
Adj-Wald | 16 | 0.7074793 | 1.0000000 | NO | YES | NO
Adj-ArcSine | 16 | 0.6798301 | 0.9701145 | NO | NO | NO
Adj-Likelihood | 16 | 0.6701048 | 0.9624273 | NO | NO | NO
Adj-Score | 16 | 0.6536394 | 0.9501899 | NO | NO | NO
Adj-Logit Wald | 16 | 0.6386495 | 0.9532031 | NO | NO | NO
Adj-Wald-T | 16 | 0.6887962 | 1.0000000 | NO | YES | NO
CI with x=16, n=109 & h=2
Method | x | Lower limit | Upper limit | Lower aberration | Upper aberration | ZWI
Adj-Wald | 16 | 0.0918193 | 0.2267648 | NO | NO | NO
Adj-ArcSine | 16 | 0.0979757 | 0.2321580 | NO | NO | NO
Adj-Likelihood | 16 | 0.0998091 | 0.2340433 | NO | NO | NO
Adj-Score | 16 | 0.1032005 | 0.2377869 | NO | NO | NO
Adj-Logit Wald | 16 | 0.1027218 | 0.2387274 | NO | NO | NO
Adj-Wald-T | 16 | 0.0905595 | 0.2280246 | NO | NO | NO
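The Adj-Wald rows can be reproduced by adding h pseudo successes and h pseudo failures before applying the Wald formula, that is, by replacing x with x + h and n with n + 2h. This reading is inferred from the tabulated values rather than taken from the package source, and `adj_wald_ci` is a hypothetical helper.

```r
# Wald interval after adding h pseudo successes and h pseudo failures.
# Hypothetical helper; limits are truncated to [0, 1].
adj_wald_ci <- function(x, n, alp = 0.05, h = 2) {
  z    <- qnorm(1 - alp / 2)
  ptil <- (x + h) / (n + 2 * h)                     # adjusted proportion
  hw   <- z * sqrt(ptil * (1 - ptil) / (n + 2 * h)) # half-width
  c(lower = max(ptil - hw, 0), upper = min(ptil + hw, 1))
}

res <- sapply(0:2, function(h) adj_wald_ci(16, 17, h = h))
colnames(res) <- paste0("h=", 0:2)
print(round(res, 7))
```

Increasing h pulls the adjusted proportion towards 0.5, so for x = 16, n = 17 the lower limit moves down as h grows, while the raw upper limit stays above 1 and is truncated.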
To compare the length of the intervals for the data \(x = 16, n = 17\), a graphical form can be obtained from the package using
# 21. Paper 3&4 - Plot of all the adjusted CI with h=1
PlotciAAllxg(x=16,n=17,alp=0.05,h=1)

As can be seen above, the grouping function (ending with g) conveniently orders the results within each value of x.
Another aspect is the way the Exact method is handled: based on the extensive studies on adjusting the Exact method, this package confines itself to the randomized test using the constant e in [0, 1]. Example 6 (Joseph and Reinhold 2005, see table above) may be reproduced with the function ciEXx as shown below.
# 22. Paper 3&4 - display the function
ciEXx(x=98, n=100,alp = 0.05,e=c(0.1,.5,0.95,1))
CI-Exact with x=98, n=100
x | Lower limit | Upper limit | Lower aberration | Upper aberration | ZWI | e
98 | 0.9429629 | 0.9946812 | NO | NO | NO | 0.10
98 | 0.9355021 | 0.9966313 | NO | NO | NO | 0.50
98 | 0.9300850 | 0.9975033 | NO | NO | NO | 0.95
98 | 0.9295962 | 0.9975662 | NO | NO | NO | 1.00
Paper 5 (Tuyl et al 2008):
This paper compares different non-informative priors with an informative prior based on an earlier study, for a single binomial proportion with a real data set (\(x = 0, n = 167\)). This is one of the most often cited examples of zero successes, a case that has seen active research. The comparison is based on the predictive density and is used to argue for a specific prior assumption. This package provides readily available options for Bayesian computation using posterior predictive distributions, allowing a wider comparison of probabilities. A quick comparison under the uniform prior, for zero future successes or for the possibility of \(p = 0.5\) (xnew equal to half of m), can be explored using the function \(probPREx(x,n,xnew,m,a1,a2)\). The arguments xnew and m vary, keeping x=0, n=167, a1=a2=1.
Predicted probability with x=0, n=167 varying xnew and m
x | n | xnew | m | Predicted probability
0 | 167 | 0 | 10 | 0.9438202
0 | 167 | 0 | 50 | 0.7706422
0 | 167 | 0 | 100 | 0.6268657
0 | 167 | 0 | 150 | 0.5283019
0 | 167 | 5 | 10 | 0.0000002
0 | 167 | 25 | 50 | 0.0000000
0 | 167 | 50 | 100 | 0.0000000
0 | 167 | 75 | 150 | 0.0000000
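These predicted probabilities follow from the beta-binomial posterior predictive distribution: after x successes in n trials with a Beta(a1, a2) prior, the number of successes in m further trials is beta-binomial with parameters a1 + x and a2 + n - x. A minimal base-R sketch is given below. `pred_prob` is a hypothetical helper that borrows the argument names of probPREx purely for readability.

```r
# Posterior predictive probability of xnew successes in m future trials,
# after observing x successes in n trials under a Beta(a1, a2) prior.
# Hypothetical helper; computed on the log scale for numerical stability.
pred_prob <- function(x, n, xnew, m, a1, a2) {
  a_post <- a1 + x
  b_post <- a2 + n - x
  exp(lchoose(m, xnew) +
        lbeta(a_post + xnew, b_post + m - xnew) -
        lbeta(a_post, b_post))
}

m_vals <- c(10, 50, 100, 150)
p_zero <- sapply(m_vals, function(m) pred_prob(0, 167, xnew = 0, m = m, a1 = 1, a2 = 1))
print(round(p_zero, 7))
```

Under the uniform prior with x = 0 the probability of zero successes in m further trials simplifies to (n + 1) / (n + m + 1), for example 168/178 for m = 10.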
Assuming that the example depicts a rare event, posterior probabilities computed with the function \(probPOSx(x,n,a,b,th)\) add to the analysis.
Guidance for priors used below
Prior | a | b
Uniform prior | 1.000 | 1.00
Jeffreys prior | 0.500 | 0.50
Tuyl p1 | 0.042 | 27.96
Tuyl p2 | 1.000 | 666.00
Tuyl p3 | 1.000 | 398.00
Posterior probability with x=0, n=167 varying th
th | Uniform prior | Jeffreys prior | Tuyl p1 | Tuyl p2 | Tuyl p3
th=0.001 | 0.1547172 | 0.4370766 | 0.9479747 | 0.5654381 | 0.4318005
th=0.01 | 0.8151954 | 0.9332767 | 0.9976693 | 0.9997687 | 0.9965811
th=0.1 | 1.0000000 | 1.0000000 | 1.0000000 | 1.0000000 | 1.0000000
th=0.05 | 0.9998190 | 0.9999656 | 0.9999998 | 1.0000000 | 1.0000000
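Because the posterior under a Beta(a, b) prior is Beta(a + x, b + n - x), the posterior probability of \(p < th\) is a single call to the Beta distribution function. The sketch below evaluates it at th = 0.001 for the priors listed above. `post_prob` is a hypothetical helper, not the package's probPOSx.

```r
# Posterior probability P(p < th | x, n) under a Beta(a, b) prior.
# Hypothetical helper built on the conjugate Beta posterior.
post_prob <- function(x, n, a, b, th) pbeta(th, a + x, b + n - x)

priors <- rbind(uniform  = c(1, 1),
                jeffreys = c(0.5, 0.5),
                tuyl_p1  = c(0.042, 27.96),
                tuyl_p2  = c(1, 666),
                tuyl_p3  = c(1, 398))
probs <- apply(priors, 1, function(ab) post_prob(0, 167, ab[1], ab[2], th = 0.001))
print(round(probs, 7))
```

For priors with a = 1 and x = 0 the result has the closed form 1 - (1 - th)^(b + n), which makes the effect of a larger prior b directly visible.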
The literature also often reports frequentist evaluation criteria for Bayesian methods; this package therefore includes the most prominent methods as well as other measures, enlarging the scope of comparison.
Paper 6 (Vos and Hudson 2005):
The evaluation criteria p-confidence and p-bias are taken from Vos and Hudson (2005). The p-confidence and p-bias for the two types of Bayesian CI for \(n = 10\), obtained using \(pCOpBIBA(n,alp,a1,a2)\), are
p-Confidence & p-Bias of Bayesian method for n=10, a1=a2=1
x | p-confidence (quantile) | p-bias (quantile) | p-confidence (HPD) | p-bias (HPD)
1 | 58.75446 | 33.416269 | 86.04498 | 1.7034458
2 | 76.33893 | 15.007198 | 86.58091 | 1.4986566
3 | 82.59405 | 7.832379 | 87.14676 | 1.0040862
4 | 85.83952 | 3.477491 | 87.57388 | 0.4968235
5 | 87.87691 | 0.000000 | 87.87691 | 0.0000000
6 | 85.83952 | 3.477491 | 87.57388 | 0.4968235
7 | 82.59405 | 7.832379 | 87.14676 | 1.0040859
8 | 76.33893 | 15.007198 | 86.58091 | 1.4986566
9 | 58.75446 | 33.416269 | 86.04498 | 1.7034458
References
[1] Newcombe RG (1998). Two-sided confidence intervals for the single proportion: comparison of seven methods. Statistics in Medicine, 17, 857-872.
[2] Joseph L and Reinhold C (2005). Statistical inference for proportions. American Journal of Roentgenology, 184, 1057-1064.
[3] Zhou XH, Li CM and Yang Z (2008). Improving interval estimation of binomial proportions. Phil. Trans. R. Soc. A, 366, 2405-2418.
[4] Yu W, Guo X and Xu W (2012). An improved score interval with a modified midpoint for a binomial proportion. Journal of Statistical Computation and Simulation, 84(5), 1-17.