Introduction

Let X denote the number of successes in n independent Bernoulli trials, so that X ~ Binomial(n, p), and let x be its observed value; then \(\hat p = x/n\) denotes the sample proportion. Inference on a single binomial proportion (p) has drawn appreciable research attention with theoretical, applied, and pedagogic objectives.

This package collates widely used methods for inference on p, together with prominent procedures for comparing their performance. It covers the two major statistical paradigms, Classical and Bayesian; the latter in particular provides a set of tools that broadens this scope.

The objective of this package is to present the interval estimation procedures for p outlined above in a more comprehensive way. It includes performance assessments of these procedures: coverage probability, expected length, error, p-confidence and p-bias. An array of Bayesian computations (Bayes factor, Empirical Bayesian, posterior predictive computation, and posterior probability) with a conjugate prior is also made available. More importantly, the package complements the numerical summaries with graphical forms that enhance presentation and teaching activities.

Workflow

The following figure depicts how this inferential problem can be organised so as to expand the scope of computations; bold face indicates a modification of existing procedures or the addition of new procedures, such as the t-distribution based Wald method, that are not otherwise available to a wider audience.

Figure: Inference on single Binomial Proportion

Notations Used

  1. x: Number of successes
  2. n: Number of trials
  3. \(\alpha\): Level of significance
  4. e: Exact method indicator in [0, 1] {1: Clopper Pearson, 0.5: Mid P}. In all exact functions you can set a range of values between 0 and 1.
  5. a and b: Beta parameters for hypothetical parameter generation; Prior parameters in Bayesian predictive models
  6. t1 and t2: Limits for tolerance (within which CP lies)
  7. \(\pi\): Population parameter
  8. f: Failure limit
  9. h: Constant used in adjustment methods
  10. c: Constant used in continuity corrected methods
  11. \(a_1,a_2\): Prior parameters in Bayesian estimation procedures
  12. LL, UL: Lower and Upper limits for the intervals due to any other methods
  13. s: Number of simulations
  14. hp: Hypothetical parameter values
  15. sL, sU: Lower and Upper specification for hyper prior in Empirical Bayesian (EB) approach
  16. m, xnew: Number of trials and number of successes in Bayesian predictive models
  17. th0, th1, th2: Parameter values in the models \(M_0, M_1, M_2\) of Bayes factor
  18. \(a_j, b_j\): Prior parameters in the models \(M_j (j = 0, 1, 2)\) of Bayes factor
  19. th: Parameter value in Bayesian posterior probabilities

Naming convention used in the package

Naming convention used in functions
Abbreviation Expansion
ci, ciA, ciC Confidence Interval, adjusted CI and continuity corrected CI
covp, covpA, covpC Coverage Probability, adjusted CP & continuity corrected CP
expl, explA, explC Expected Length, adjusted Expected Length & continuity corrected EL
length, lengthA, lengthC Sum of Length, adjusted Sum of Length & continuity corrected sumLen
pCOpBI, pCOpBIA, pCOpBIC p-Confidence & p-Bias, adjusted p-Conf & p-Bias and continuity corrected p-Confidence & p-Bias
err, errA, errC Error, adjusted error and continuity corrected error
AS ArcSine
LR Likelihood Ratio
LT Logit Wald
SC Score
TW Wald-T
WD Wald
All 6 base methods - Wald, Wald-T, Logit Wald, ArcSine, LR, Score
AAll 6 adj methods - Wald, Wald-T, Logit Wald, ArcSine, LR, Score
CAll 5 cont. corr. methods - Wald, Wald-T, Logit Wald, ArcSine, Score
BA Bayesian
EX Exact - setting e=0.5 gives mid-p and e=1 gives Clopper-Pearson
Guide to identify core functions - Plot, Modifications and x are optional
Name components, in the order in which they are joined:
  1. Plot (optional): Plot
  2. Concept: ci, covp, expl, length, pCOpBI, err
  3. Modifications (optional): A, C
  4. Name: AS, SC, BA, EX, TW, LT, WD, LR
  5. Single x (optional): x
Sample combinations and the resulting sample functions:
  1. ci + A + AS + x = ciAASx
  2. Plot + ci + A + AS + x = PlotciAASx
  3. Plot + covp + C + SC = PlotcovpCSC
  4. expl + A + TW = explATW
  5. expl + A + TW + x = explATWx
  6. length + WD = lengthWD
  7. length + A + WD = lengthAWD
  8. length + C + WD = lengthCWD
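
The composition rule in the guide above can be sketched in a few lines of R. The helper make_name is hypothetical and is not part of the package; it only pastes the pieces together in the stated order. Not every name it can produce exists as a function (for example, the tables in this document list no continuity corrected LR function).

```r
# Hypothetical helper (not part of the package): assemble a function name
# from the optional "Plot" prefix, the concept, the optional modification,
# the method abbreviation and the optional single-x suffix.
make_name <- function(concept, method, modification = "",
                      plot = FALSE, single_x = FALSE) {
  paste0(if (plot) "Plot" else "",
         concept, modification, method,
         if (single_x) "x" else "")
}

# Adjusted ArcSine confidence interval for a single x
make_name("ci", "AS", modification = "A", single_x = TRUE)

# Plot of coverage probability for the continuity corrected Score method
make_name("covp", "SC", modification = "C", plot = TRUE)
```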

Confidence Interval

Confidence Interval
Basic Basic-x Adj Adj-x CC CC-x
ArcSine ciAS ciASx ciAAS ciAASx ciCAS ciCASx
LR ciLR ciLRx ciALR ciALRx
Logit ciLT ciLTx ciALT ciALTx ciCLT ciCLTx
Score ciSC ciSCx ciASC ciASCx ciCSC ciCSCx
Wald-T ciTW ciTWx ciATW ciATWx ciCTW ciCTWx
Wald ciWD ciWDx ciAWD ciAWDx ciCWD ciCWDx
All ciAll ciAllx ciAAll ciAAllx ciCAll ciCAllx
Bayes ciBA ciBAx
Exact ciEX ciEXx
Plotting functions of CI
Basic Basic-x Adj Adj-x CC CC-x
ArcSine PlotciAS PlotciAAS PlotciCAS
LR PlotciLR PlotciALR
Logit PlotciLT PlotciALT PlotciCLT
Score PlotciSC PlotciASC PlotciCSC
Wald-T PlotciTW PlotciATW PlotciCTW
Wald PlotciWD PlotciAWD PlotciCWD
Allg PlotciAllg PlotciAllxg PlotciAAllg PlotciAAllxg PlotciCAllg PlotciCAllxg
All PlotciAll PlotciAllx PlotciAAll PlotciAAllx PlotciCAll PlotciCAllx
Bayes PlotciBA
Exact PlotciEX PlotciEXx

1. CONFIDENCE INTERVAL - BASE METHODS

  1. Wald:
    Wald-type interval obtained by inverting the large-sample test, with the standard error evaluated at the maximum likelihood estimate, for all \(x = 0, 1, 2, \ldots, n\).
  2. Score:
    Score interval, due to Wilson, obtained by inverting the test with the standard error evaluated at the null hypothesis, for all \(x = 0, 1, 2, \ldots, n\).
  3. ArcSine:
    Wald-type interval for all \(x = 0, 1, 2, \ldots, n\) using the arcsine transformation of the parameter \(p\); that is, based on the normal approximation for \(\sin^{-1}(\sqrt{p})\).
  4. Logit Wald:
    Wald-type interval for all \(x = 0, 1, 2, \ldots, n\) based on the logit transformation of \(p\); that is, on the normal approximation for \(\log\frac{p}{1-p}\).
  5. Wald-t:
    An approximate method based on a t-approximation of the standardized point estimator (the point estimator divided by its estimated standard error) for all \(x = 0, 1, 2, \ldots, n\). The essential boundary modification is that \(\hat p =\frac{x+2}{n+4}\) is used when \(x = 0\) or \(x = n\).
  6. Likelihood Ratio:
    Likelihood ratio limits for all \(x = 0, 1, 2, \ldots, n\), obtained as the solutions in \(p\) of the equation formed from the logarithm of the ratio between the binomial likelihood at the sample proportion and the likelihood at \(p\).
  7. Exact:
    Confidence interval for \(p\) (for all \(x = 0, 1, 2, \ldots, n\)), based on inverting equal-tailed binomial tests with null hypothesis \(H_0: p = p_0\) and calculated from the cumulative binomial distribution. The exact two-sided P-value is usually calculated as \(P = 2[e\Pr(X = x) + \min\{\Pr(X < x), \Pr(X > x)\}]\), where the probabilities are evaluated at the null value of \(p\) and \(0 \le e \le 1\).
  8. Bayesian:
    Highest Posterior Density (HPD) and two-tailed (quantile based) intervals are provided for all \(x = 0, 1, 2, \ldots, n\), based on the conjugate prior beta \((a, b)\) for the probability of success \(p\) of the binomial distribution, so that the posterior is beta \((x + a, n - x + b)\).
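
Three of the base methods above have closed forms that are easy to write down in base R. The following is a minimal sketch under stated assumptions: the function names wald_ci, score_ci and bayes_quantile_ci are illustrative and are not the package's own functions, and limits are returned without any truncation to [0, 1].

```r
# Base-R sketch only; these names are illustrative, not the package API.

# Wald: standard error evaluated at the maximum likelihood estimate
wald_ci <- function(x, n, alp = 0.05) {
  z <- qnorm(1 - alp / 2)
  phat <- x / n
  se <- sqrt(phat * (1 - phat) / n)
  c(lower = phat - z * se, upper = phat + z * se)
}

# Score (Wilson): standard error evaluated at the null value of p
score_ci <- function(x, n, alp = 0.05) {
  z <- qnorm(1 - alp / 2)
  phat <- x / n
  centre <- (phat + z^2 / (2 * n)) / (1 + z^2 / n)
  half <- z * sqrt(phat * (1 - phat) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(lower = centre - half, upper = centre + half)
}

# Bayesian two-tailed (quantile based) interval from the
# beta(x + a, n - x + b) posterior
bayes_quantile_ci <- function(x, n, alp = 0.05, a = 1, b = 1) {
  c(lower = qbeta(alp / 2, x + a, n - x + b),
    upper = qbeta(1 - alp / 2, x + a, n - x + b))
}

wald_ci(0, 20)
score_ci(0, 20)
bayes_quantile_ci(0, 20, a = 1, b = 1)
```

For x = 0 the Wald interval collapses to a single point, while the score upper limit reduces to \(z^2/(n + z^2)\), which is why the two methods behave so differently at the boundary.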

2. CONFIDENCE INTERVAL - ADJUSTED METHODS

  1. Wald:
    The given data \(x\) and \(n\) are modified to \(x + h\) and \(n + 2h\) respectively, where \(h > 0\), and the Wald-type interval is then applied for all \(x = 0, 1, 2, \ldots, n\).
  2. Score:
    The score test approach is used after the given data \(x\) and \(n\) are modified to \(x + h\) and \(n + 2h\) respectively, where \(h > 0\), for all \(x = 0, 1, 2, \ldots, n\).
  3. ArcSine:
    Wald-type interval for the arcsine transformation of the parameter \(p\), applied to the modified data \(x + h\) and \(n + 2h\), where \(h > 0\), for all \(x = 0, 1, 2, \ldots, n\).
  4. Logit Wald:
    Wald-type interval for the logit transformation \(\log\frac{p}{1-p}\) of the parameter \(p\), applied to the modified data \(x + h\) and \(n + 2h\), where \(h > 0\), for all \(x = 0, 1, 2, \ldots, n\).
  5. Wald-t:
    The given data \(x\) and \(n\) are modified to \(x + h\) and \(n + 2h\) respectively, where \(h > 0\), and the approximate method based on a t-approximation of the standardized point estimator is then applied for all \(x = 0, 1, 2, \ldots, n\).
  6. Likelihood Ratio:
    Likelihood ratio limits for the data \(x + h\) and \(n + 2h\) instead of the given \(x\) and \(n\), where \(h\) is a positive integer \((1, 2, \ldots)\), for all \(x = 0, 1, 2, \ldots, n\).
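
The adjustment is the same for every method in this list: replace \(x\) by \(x + h\) and \(n\) by \(n + 2h\), then apply the base method. A minimal base-R sketch for the adjusted Wald case follows. The name adj_wald_ci is illustrative, and the truncation to [0, 1] together with the "aberration" flags is an assumption modelled on the LowerAbb and UpperAbb columns shown in the numerical summaries of this document.

```r
# Illustrative sketch, not the package implementation.
adj_wald_ci <- function(x, n, alp = 0.05, h = 2) {
  z <- qnorm(1 - alp / 2)
  y <- x + h            # adjusted number of successes
  m <- n + 2 * h        # adjusted number of trials
  phat <- y / m
  se <- sqrt(phat * (1 - phat) / m)
  raw <- c(phat - z * se, phat + z * se)
  list(lower = max(raw[1], 0),      # truncated to the parameter space
       upper = min(raw[2], 1),
       lower_abb = raw[1] < 0,      # assumed meaning of "aberration"
       upper_abb = raw[2] > 1)
}

adj_wald_ci(x = 0, n = 20, h = 2)
```

With h = 2 this is the familiar "add two successes and two failures" adjustment.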

3. CONFIDENCE INTERVAL - CONTINUITY CORRECTED METHODS

  1. Wald:
    Wald-type interval (for all \(x = 0, 1, 2, \ldots, n\)) using the test statistic \(\frac{|\hat p - p| - c}{SE}\), where \(c > 0\) is a constant for continuity correction.
  2. Score:
    A score test approach using the test statistic \(\frac{|\hat p - p| - c}{SE}\), where \(0 < c < 1/(2n)\) is a constant for continuity correction, for all \(x = 0, 1, 2, \ldots, n\).
  3. ArcSine:
    Wald-type interval for the arcsine transformation using the test statistic \(\frac{|\sin^{-1}\sqrt{\hat p} - \sin^{-1}\sqrt{p}| - c}{SE}\), where \(c > 0\) is a constant for continuity correction, for all \(x = 0, 1, 2, \ldots, n\).
  4. Logit Wald:
    Wald-type interval for the logit transformation of the parameter \(p\) using the test statistic \(\frac{|L(\hat p) - L(p)| - c}{SE}\), where \(c > 0\) is a constant for continuity correction and \(L(p) = \log\frac{p}{1-p}\), for all \(x = 0, 1, 2, \ldots, n\). When \(x = 0\) or \(x = n\), the boundary is modified using the Exact method values.
  5. Wald-t:
    Approximate method based on a t-approximation of the standardized point estimator using the test statistic \(\frac{|\hat p - p| - c}{SE}\), where \(c > 0\) is a constant for continuity correction, for all \(x = 0, 1, 2, \ldots, n\). When \(x = 0\) or \(x = n\), the boundary is modified using the Wald adjustment method with \(h = 2\).
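
Inverting the continuity corrected Wald statistic gives \(|\hat p - p| \le z\,SE + c\), so the correction simply widens the Wald interval by \(c\) on each side. The sketch below illustrates this in base R; the function name cc_wald_ci and the argument name cc (used instead of c to avoid masking R's c function) are assumptions for illustration only.

```r
# Illustrative sketch, not the package implementation.
cc_wald_ci <- function(x, n, alp = 0.05, cc = 1 / (2 * n)) {
  z <- qnorm(1 - alp / 2)
  phat <- x / n
  se <- sqrt(phat * (1 - phat) / n)
  half <- z * se + cc               # Wald half-width plus the correction
  c(lower = phat - half, upper = phat + half)
}

# Untruncated limits for x = 0, n = 20 with c = 1/(2n) = 1/40
cc_wald_ci(x = 0, n = 20, cc = 1 / 40)
```

At x = 0 the standard error is zero, so the whole width of the interval comes from the correction; the negative lower limit is what a truncating implementation would report as an aberration.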

Coverage Probability

Coverage Probability
Basic Adjusted Continuity corrected
ArcSine covpAS covpAAS covpCAS
LR covpLR covpALR
Logit covpLT covpALT covpCLT
Score covpSC covpASC covpCSC
Wald-T covpTW covpATW covpCTW
Wald covpWD covpAWD covpCWD
All covpAll covpAAll covpCAll
Bayes covpBA
Exact covpEX
Plotting functions of Coverage Probability
Basic Adjusted Continuity corrected
ArcSine PlotcovpAS PlotcovpAAS PlotcovpCAS
LR PlotcovpLR PlotcovpALR
Logit PlotcovpLT PlotcovpALT PlotcovpCLT
Score PlotcovpSC PlotcovpASC PlotcovpCSC
Wald-T PlotcovpTW PlotcovpATW PlotcovpCTW
Wald PlotcovpWD PlotcovpAWD PlotcovpCWD
All PlotcovpAll PlotcovpAAll PlotcovpCAll
Bayes PlotcovpBA
Exact PlotcovpEX

4. Metric 1: COVERAGE PROBABILITY (Applicable to Base, Adjusted and Continuity Corrected Methods)

  1. Wald:
    Evaluation of the Wald-type interval using coverage probability, root mean square statistic, and the proportion of coverage probabilities that lie within the desired tolerance limits
  2. Score:
    Evaluation of the score test approach using coverage probability, root mean square statistic, and the proportion of coverage probabilities that lie within the desired tolerance limits
  3. ArcSine:
    Evaluation of the Wald-type interval for the arcsine transformation of the parameter \(p\) using coverage probability, root mean square statistic, and the proportion of coverage probabilities that lie within the desired tolerance limits
  4. Logit Wald:
    Evaluation of the Wald-type interval based on the logit transformation of \(p\) using coverage probability, root mean square statistic, and the proportion of coverage probabilities that lie within the desired tolerance limits
  5. Wald-t:
    Evaluation of the approximate method based on a t-approximation of the standardized point estimator using coverage probability, root mean square statistic, and the proportion of coverage probabilities that lie within the desired tolerance limits
  6. Likelihood Ratio:
    Evaluation of the likelihood ratio limits using coverage probability, root mean square statistic, and the proportion of coverage probabilities that lie within the desired tolerance limits
  7. Exact:
    Evaluation of the confidence interval for \(p\) based on inverting equal-tailed binomial tests with null hypothesis \(H_0: p = p_0\) using coverage probability, root mean square statistic, and the proportion of coverage probabilities that lie within the desired tolerance limits.
  8. Bayesian:
    Evaluation of the Bayesian Highest Posterior Density (HPD) and two-tailed intervals using coverage probability, root mean square statistic, and the proportion of coverage probabilities that lie within the desired tolerance limits, for the Beta-Binomial conjugate prior model for the probability of success \(p\).
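
For a fixed \(p\), the coverage probability of any method is the total binomial probability of those \(x\) whose interval contains \(p\). The base-R sketch below computes it exactly for the Wald interval and then summarises it over a grid of \(p\) values. The grid, the function names and the summary labels are assumptions for illustration; the package itself may draw its hypothetical parameter values differently.

```r
# Illustrative sketch, not the package implementation.
wald_ci <- function(x, n, alp = 0.05) {
  z <- qnorm(1 - alp / 2)
  phat <- x / n
  se <- sqrt(phat * (1 - phat) / n)
  c(phat - z * se, phat + z * se)
}

# Exact coverage at a single p: sum the binomial probabilities of the
# x values whose interval contains p.
coverage <- function(p, n, ci_fun, alp = 0.05) {
  x <- 0:n
  lim <- sapply(x, function(k) ci_fun(k, n, alp))  # 2 x (n + 1) matrix
  covered <- lim[1, ] <= p & p <= lim[2, ]
  sum(dbinom(x, n, p)[covered])
}

# Summaries over an assumed grid of p values, with tolerance limits t1, t2
grid <- seq(0.01, 0.99, by = 0.01)
cp <- sapply(grid, coverage, n = 10, ci_fun = wald_ci)
t1 <- 0.93
t2 <- 0.97
c(mean_cp = mean(cp),
  min_cp  = min(cp),
  rmse    = sqrt(mean((cp - 0.95)^2)),            # RMS distance from nominal
  tol     = 100 * mean(cp >= t1 & cp <= t2))      # % of grid within (t1, t2)
```

Any other interval function with the same (x, n, alp) signature can be passed as ci_fun to compare methods on the same grid.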

Length

Sum of length
SumLen Adj-SumLen CC-SumLen
ArcSine lengthAS lengthAAS lengthCAS
LR lengthLR lengthALR
Logit lengthLT lengthALT lengthCLT
Score lengthSC lengthASC lengthCSC
Wald-T lengthTW lengthATW lengthCTW
Wald lengthWD lengthAWD lengthCWD
All lengthAll lengthAAll lengthCAll
Bayes lengthBA
Exact lengthEX
Plotting functions of sum length and expected length (EL)
SumLen EL Adj-SumLen Adj-EL CC-SumLen CC-EL
ArcSine PlotlengthAS PlotexplAS PlotlengthAAS PlotexplAAS PlotlengthCAS PlotexplCAS
LR PlotlengthLR PlotexplLR PlotlengthALR PlotexplALR
Logit PlotlengthLT PlotexplLT PlotlengthALT PlotexplALT PlotlengthCLT PlotexplCLT
Score PlotlengthSC PlotexplSC PlotlengthASC PlotexplASC PlotlengthCSC PlotexplCSC
Wald-T PlotlengthTW PlotexplTW PlotlengthATW PlotexplATW PlotlengthCTW PlotexplCTW
Wald PlotlengthWD PlotexplWD PlotlengthAWD PlotexplAWD PlotlengthCWD PlotexplCWD
All PlotlengthAll PlotexplAll PlotlengthAAll PlotexplAAll PlotlengthCAll PlotexplCAll
Bayes PlotlengthBA PlotexplBA
Exact PlotlengthEX PlotexplEX

4. Metric 2: EXPECTED LENGTH (Applicable to Base, Adjusted and Continuity Corrected Methods)

  1. Wald:
    Evaluation of the Wald-type intervals using the expected length of the \(n + 1\) intervals
  2. Score:
    Evaluation of the score test approach using the expected length of the \(n + 1\) intervals
  3. ArcSine:
    Evaluation of the Wald-type interval for the arcsine transformation of the parameter \(p\) using the expected length of the \(n + 1\) intervals
  4. Logit Wald:
    Evaluation of the Wald-type interval based on the logit transformation of \(p\) using the expected length of the \(n + 1\) intervals
  5. Wald-t:
    Evaluation of the approximate method based on a t-approximation of the standardized point estimator using the expected length of the \(n + 1\) intervals
  6. Likelihood Ratio:
    Evaluation of the likelihood ratio limits using the expected length of the \(n + 1\) intervals
  7. Exact:
    Evaluation of the confidence interval for \(p\) based on inverting equal-tailed binomial tests with null hypothesis \(H_0: p = p_0\) using the expected length of the \(n + 1\) intervals.
  8. Bayesian:
    Evaluation of the Bayesian Highest Posterior Density (HPD) and two-tailed intervals using the expected length of the \(n + 1\) intervals, for the Beta-Binomial conjugate prior model for the probability of success \(p\).
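
The expected length at a given \(p\) weights the length of each of the \(n + 1\) intervals by the binomial probability of the corresponding \(x\). A minimal base-R sketch for the Wald interval is given below. The function names are illustrative, the lengths are taken from the untruncated Wald limits, and reading "sum of length" as the plain sum of the \(n + 1\) interval lengths is an assumption.

```r
# Illustrative sketch, not the package implementation.

# Lengths of the n + 1 (untruncated) Wald intervals, one per x = 0, ..., n
wald_len <- function(n, alp = 0.05) {
  phat <- (0:n) / n
  2 * qnorm(1 - alp / 2) * sqrt(phat * (1 - phat) / n)
}

# Expected length at p: interval lengths weighted by Binomial(n, p)
expected_length <- function(p, n, alp = 0.05) {
  sum(dbinom(0:n, n, p) * wald_len(n, alp))
}

# Assumed meaning of "sum of length": unweighted sum over all x
sum_length <- function(n, alp = 0.05) sum(wald_len(n, alp))

expected_length(0.5, n = 10)
sum_length(n = 10)
```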

p-Confidence & p-Bias

p-Confidence & p-Bias
Basic Adjusted Continuity corrected
ArcSine pCOpBIAS pCOpBIAAS pCOpBICAS
LR pCOpBILR pCOpBIALR
Logit pCOpBILT pCOpBIALT pCOpBICLT
Score pCOpBISC pCOpBIASC pCOpBICSC
Wald-T pCOpBITW pCOpBIATW pCOpBICTW
Wald pCOpBIWD pCOpBIAWD pCOpBICWD
All pCOpBIAll pCOpBIAAll pCOpBICAll
Bayes pCOpBIBA
Exact pCOpBIEX
Plotting functions for p-Confidence & p-Bias
Basic Adjusted Continuity corrected
ArcSine PlotpCOpBIAS PlotpCOpBIAAS PlotpCOpBICAS
LR PlotpCOpBILR PlotpCOpBIALR
Logit PlotpCOpBILT PlotpCOpBIALT PlotpCOpBICLT
Score PlotpCOpBISC PlotpCOpBIASC PlotpCOpBICSC
Wald-T PlotpCOpBITW PlotpCOpBIATW PlotpCOpBICTW
Wald PlotpCOpBIWD PlotpCOpBIAWD PlotpCOpBICWD
All PlotpCOpBIAll PlotpCOpBIAAll PlotpCOpBICAll
Bayes PlotpCOpBIBA
Exact PlotpCOpBIEX

5. Metric 3: p-CONFIDENCE, p-BIAS (BASE METHOD)

  1. Wald:
    Evaluation of the Wald-type intervals using p-confidence and p-bias for the \(n + 1\) intervals
  2. Score:
    Evaluation of the score test approach using p-confidence and p-bias for the \(n + 1\) intervals
  3. ArcSine:
    Evaluation of the Wald-type interval for the arcsine transformation of the parameter \(p\) using p-confidence and p-bias for the \(n + 1\) intervals
  4. Logit Wald:
    Evaluation of the Wald-type interval based on the logit transformation of \(p\) using p-confidence and p-bias for the \(n + 1\) intervals
  5. Wald-t:
    Evaluation of the approximate method based on a t-approximation of the standardized point estimator using p-confidence and p-bias for the \(n + 1\) intervals
  6. Likelihood Ratio:
    Evaluation of the likelihood ratio limits using p-confidence and p-bias for the \(n + 1\) intervals
  7. Exact:
    Evaluation of the confidence interval for \(p\) based on inverting equal-tailed binomial tests with null hypothesis \(H_0: p = p_0\) using p-confidence and p-bias for the \(n + 1\) intervals.
  8. Bayesian:
    Evaluation of the Bayesian Highest Posterior Density (HPD) and two-tailed intervals using p-confidence and p-bias for the \(n + 1\) intervals, for the Beta-Binomial conjugate prior model for the probability of success \(p\).

6. Metric 3: p-CONFIDENCE, p-BIAS (Applicable to Base, Adjusted and Continuity Corrected Methods)

  1. Wald:
    Evaluation of the adjusted Wald-type interval using p-confidence and p-bias for the \(n + 1\) intervals
  2. Score:
    Evaluation of the adjusted score test approach using p-confidence and p-bias for the \(n + 1\) intervals
  3. ArcSine:
    Evaluation of the adjusted Wald-type interval for the arcsine transformation of the parameter \(p\) using p-confidence and p-bias for the \(n + 1\) intervals
  4. Logit Wald:
    Evaluation of the adjusted Wald-type interval based on the logit transformation of \(p\) using p-confidence and p-bias for the \(n + 1\) intervals
  5. Wald-t:
    Evaluation of the adjusted approximate method based on a t-approximation of the standardized point estimator using p-confidence and p-bias for the \(n + 1\) intervals
  6. Likelihood Ratio:
    Evaluation of the adjusted likelihood ratio limits using p-confidence and p-bias for the \(n + 1\) intervals

Error and long term power

Error and long term power
Basic Adjusted Continuity corrected
ArcSine errAS errAAS errCAS
LR errLR errALR
Logit errLT errALT errCLT
Score errSC errASC errCSC
Wald-T errTW errATW errCTW
Wald errWD errAWD errCWD
All errAll errAAll errCAll
Bayes errBA
Exact errEX
Plotting functions for error and long term power
Basic Adjusted Continuity corrected
ArcSine PloterrAS PloterrAAS PloterrCAS
LR PloterrLR PloterrALR
Logit PloterrLT PloterrALT PloterrCLT
Score PloterrSC PloterrASC PloterrCSC
Wald-T PloterrTW PloterrATW PloterrCTW
Wald PloterrWD PloterrAWD PloterrCWD
All PloterrAll PloterrAAll PloterrCAll
Bayes PloterrBA
Exact PloterrEX

7. Metric 4: ERROR (Applicable to Base, Adjusted and Continuity Corrected Methods)

  1. Wald:
    Evaluation of the Wald-type intervals using the error due to the difference between the achieved and nominal levels of significance for the \(n + 1\) intervals
  2. Score:
    Evaluation of the score test approach using the error due to the difference between the achieved and nominal levels of significance for the \(n + 1\) intervals
  3. ArcSine:
    Evaluation of the Wald-type interval for the arcsine transformation of the parameter \(p\) using the error due to the difference between the achieved and nominal levels of significance for the \(n + 1\) intervals
  4. Logit Wald:
    Evaluation of the Wald-type interval based on the logit transformation of \(p\) using the error due to the difference between the achieved and nominal levels of significance for the \(n + 1\) intervals
  5. Wald-t:
    Evaluation of the approximate method based on a t-approximation of the standardized point estimator using the error due to the difference between the achieved and nominal levels of significance for the \(n + 1\) intervals
  6. Likelihood Ratio:
    Evaluation of the likelihood ratio limits using the error due to the difference between the achieved and nominal levels of significance for the \(n + 1\) intervals
  7. Exact:
    Evaluation of the confidence interval for \(p\) based on inverting equal-tailed binomial tests with null hypothesis \(H_0: p = p_0\) using the error due to the difference between the achieved and nominal levels of significance for the \(n + 1\) intervals.
  8. Bayesian:
    Evaluation of the Bayesian Highest Posterior Density (HPD) and two-tailed intervals using the error due to the difference between the achieved and nominal levels of significance for the \(n + 1\) intervals, for the Beta-Binomial conjugate prior model for the probability of success \(p\).

8. EVALUATION METHODS FOR GENERAL APPROACH

  1. Evaluation of intervals obtained from any method using coverage probability, root mean square statistic, and the proportion of coverage probabilities that lie within the desired tolerance limits, for the \(n + 1\) intervals and a pre-defined space for the parameter \(p\), using Monte Carlo simulation
  2. Graphical evaluation of intervals obtained from any method using coverage probability, root mean square statistic, and the proportion of coverage probabilities that lie within the desired tolerance limits, for the \(n + 1\) intervals and a pre-defined space for the parameter \(p\), using Monte Carlo simulation
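
The general approach needs only the \(n + 1\) lower and upper limits (LL, UL), whatever method produced them. The sketch below is one plausible reading of it in base R, using the notation of this document: s hypothetical parameter values hp are drawn from a Beta(a, b) distribution, the coverage probability is computed at each, and the results are summarised against the tolerance limits t1 and t2. The function name eval_general and the exact form of the summaries are assumptions, not the package's implementation.

```r
# Illustrative sketch, not the package implementation.
eval_general <- function(n, LL, UL, alp = 0.05, s = 1000,
                         a = 1, b = 1, t1 = 0.93, t2 = 0.97) {
  stopifnot(length(LL) == n + 1, length(UL) == n + 1)
  hp <- rbeta(s, a, b)                   # hypothetical parameter values
  x <- 0:n
  # Coverage at each hp: probability of the x whose interval contains hp
  cp <- vapply(hp,
               function(p) sum(dbinom(x, n, p)[LL <= p & p <= UL]),
               numeric(1))
  c(MeanCP = mean(cp),
    MinCP  = min(cp),
    RMSE_N = sqrt(mean((cp - (1 - alp))^2)),   # RMS distance from nominal
    tol    = 100 * mean(cp >= t1 & cp <= t2))  # % of hp within (t1, t2)
}

# Limits supplied by the user; here the trivial interval [0, 1] for every x
set.seed(1)
eval_general(n = 5, LL = rep(0, 6), UL = rep(1, 6), s = 50)
```

The trivial interval always covers, so its mean coverage is 1 and none of its coverage probabilities fall inside the tolerance band, which makes it a convenient sanity check before supplying real limits.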

Additional functions

Additional functions
Hypothesis covp length pCOpBI Others Error
hypotestBAF1 covpGEN lengthGEN pCOpBIGEN empericalBA errGEN
hypotestBAF1x PlotcovpGEN PlotlengthGEN PlotpCOpBIGEN empericalBAx
hypotestBAF2x covpSIM lengthSIM probPOSx
hypotestBAF2 PlotcovpSIM PlotlengthSIM probPOS
hypotestBAF3x PlotexplGEN probPREx
hypotestBAF3 PlotexplSIM probPRE
hypotestBAF4x
hypotestBAF4
hypotestBAF5x
hypotestBAF5
hypotestBAF6x
hypotestBAF6

9. Testing of hypothesis and other functions for Bayesian method

  1. EBA:
    Highest Posterior Density (HPD) and two-tailed intervals are provided for all \(x = 0, 1, 2, \ldots, n\) based on the empirical Bayesian approach for the Beta-Binomial model. Lower and upper support values are needed to obtain the MLE of the prior parameters from the marginal likelihood.
  2. probPRE:
    Computes posterior predictive probabilities for the required number of future trials (\(m\)), given the observed number of trials (\(n\)) and the parameters of the Beta prior distribution
  3. hypotestBAF1:
    Computes the Bayes factor under the Beta-Binomial model for the hypotheses \(H_0: p = p_0\) vs \(H_A: p \ne p_0\) from the given number of trials \(n\) and for all numbers of successes \(x = 0, 1, 2, \ldots, n\)
  4. hypotestBAF2:
    Computes the Bayes factor under the Beta-Binomial model for the hypotheses \(H_0: p = p_0\) vs \(H_A: p > p_0\) from the given number of trials \(n\) and for all numbers of successes \(x = 0, 1, 2, \ldots, n\)
  5. hypotestBAF3:
    Computes the Bayes factor under the Beta-Binomial model for the hypotheses \(H_0: p = p_0\) vs \(H_A: p < p_0\) from the given number of trials \(n\) and for all numbers of successes \(x = 0, 1, 2, \ldots, n\)
  6. hypotestBAF4:
    Computes the Bayes factor under the Beta-Binomial model for the hypotheses \(H_0: p \le p_0\) vs \(H_A: p > p_0\) from the given number of trials \(n\) and for all numbers of successes \(x = 0, 1, 2, \ldots, n\)
  7. hypotestBAF5:
    Computes the Bayes factor under the Beta-Binomial model for the hypotheses \(H_0: p \ge p_0\) vs \(H_A: p < p_0\) from the given number of trials \(n\) and for all numbers of successes \(x = 0, 1, 2, \ldots, n\)
  8. hypotestBAF6:
    Computes the Bayes factor under the Beta-Binomial model for the hypotheses \(H_0: p < p_1\) vs \(H_A: p > p_2\) from the given number of trials \(n\) and for all numbers of successes \(x = 0, 1, 2, \ldots, n\)
  9. probPOS:
    Computes the probability of the event \(p < p_0\) (\(p_0\) is specified in \([0, 1]\)) based on the posterior distribution of the Beta-Binomial model, with given parameters for the prior Beta distribution, for all \(x = 0, 1, 2, \ldots, n\) (\(n\): number of trials)
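
Under the conjugate Beta-Binomial model these quantities have standard closed forms, which can be sketched in base R as follows. The function names are illustrative and the formulas are the textbook ones; whether the package uses exactly these expressions, in particular for the Bayes factor, is an assumption. The argument names th, m and xnew follow the notation listed earlier in this document.

```r
# Illustrative sketch, not the package implementation.

# Posterior probability of the event p < th under a beta(a, b) prior
post_prob <- function(x, n, th, a = 1, b = 1) {
  pbeta(th, x + a, n - x + b)
}

# Posterior predictive probability of xnew successes in m future trials
# (beta-binomial probability with the posterior parameters)
pred_prob <- function(xnew, m, x, n, a = 1, b = 1) {
  choose(m, xnew) *
    beta(xnew + x + a, m - xnew + n - x + b) / beta(x + a, n - x + b)
}

# Bayes factor of H0: p = th0 against HA: p != th0, with a beta(a1, b1)
# prior under HA; the binomial coefficient cancels in the ratio
bayes_factor_01 <- function(x, n, th0, a1 = 1, b1 = 1) {
  m0 <- th0^x * (1 - th0)^(n - x)
  m1 <- beta(x + a1, n - x + b1) / beta(a1, b1)
  m0 / m1
}

post_prob(x = 5, n = 10, th = 0.5)
sum(pred_prob(0:4, m = 4, x = 5, n = 10))   # predictive probabilities sum to 1
bayes_factor_01(x = 5, n = 10, th0 = 0.5)
```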

10. Assistance for reading papers

We have taken six key papers and show how this package can assist in reproducing their results. In addition, we indicate further areas in which researchers can gain insight using the package.

Papers and corresponding functions
(data sets are listed as "#: x n", with the values as tabulated)

  1. Newcombe
    Data sets: 1: 20 0; 2: 29 1; 3: 148 15; 4: 263 81
    Methods: Wald and Score (both with and without CC), Exact and LR for CI
    Functions: ciWDx, ciSCx, ciCWDx, ciCSCx, ciEXx, ciLRx
    Additional options: Methods such as Bayesian, Arcsine, Logit Wald methods; numerical & graphical comparisons of methods; use of general CC and adj. factor
  2. Joseph 2005
    Data sets: 5: 10 10; 6: 98 100
    Methods: Wald and Exact CI
    Functions: ciWDx, ciEXx
    Additional options: Bayes factor
  3. Zhou 2008
    Data sets: 7: 17 16; 8: 14 12
    Methods: Wald, Score, Agresti-Coull & modified logit for CI
    Functions: ciWDx, ciSCx, ciAWDx
    Additional options: Other methods such as Bayesian, Arcsine, Logit transformed methods; use of general CC and adj. factor
  4. Wei 2012
    Data sets: 9: 167 0
    Methods: Score, Agresti-Coull, Bayesian (Jeffreys prior) & other two methods
    Functions: ciSCx, ciAWDx, ciBAx
    Additional options: Other classical methods; numerical & graphical comparisons of methods; use of general CC and adj. factor
  5. Tuyl 2008
    Data sets: 10: 109 16
    Methods: Bayesian method with five different beta priors
    Functions: ciBAx
    Additional options: Other classical methods; numerical & graphical comparisons of methods; use of general CC and adj. factor
  6. Vos 2005
    Data sets: 11: NA 10
    Methods: p-confidence, p-bias
    Functions: pCOpBIBA

Paper 1 (Newcombe 1998):

The paper compares seven methods (Wald, continuity corrected Wald, Likelihood ratio, Score (Wilson), continuity corrected Score, Clopper Pearson and Mid-P) for two-sided confidence intervals for a single proportion. The evaluation criteria considered are average CP, aberrations, zero width intervals and non-coverage. Four illustrative data sets are also provided. The package proportion provides a more comprehensive way of summarizing results of this kind; for example, the function ciAllx with n = 20 and x = 0 yields an easily comparable summary (numerical and graphical) together with other useful measures such as the existence of aberrations and zero width intervals (ZWI). The ArcSine and Wald-t methods are additional inclusions. The package also offers summaries and methods that are not readily available elsewhere: the Exact method in a more general form (ciEXx), continuity correction (ciCAllx) or added pseudo constants (ciAAllx) in a more general form, and Quantile (Q) based and Highest Posterior (H) based CIs from the Bayesian conjugate method (ciBAx), with an option for specifying any plausible values for the two parameters of the prior beta distribution.

Numerical Summaries

Asymptotic methods CI using ciAllx(x=0,n=20,alp=0.05)
method x LowerLimit UpperLimit LowerAbb UpperAbb ZWI
Wald 0 0.0000000 0.0000000 NO NO YES
ArcSine 0 0.0472546 0.0472546 NO NO YES
Likelihood 0 0.0000253 0.0916153 NO NO NO
Score 0 0.0000000 0.1611252 NO NO NO
Logit-Wald 0 0.0000000 0.1684335 NO NO NO
Wald-T 0 0.0000000 0.2440055 YES NO NO
Exact method CI using ciEXx(x=0,n=20,alp=0.05,e=c(0.1,0.5,0.95,1))
x LEXx UEXx LABB UABB ZWI e
0 0 0.0669670 NO NO NO 0.10
0 0 0.1391083 NO NO NO 0.50
0 0 0.1662980 NO NO NO 0.95
0 0 0.1684335 NO NO NO 1.00
Bayesian CI using ciBAx() with x=0,n=20,alp=0.05, varying a(2,1,0.5,0.02) and b(2,1,0.5,2)
Desc x LBAQx UBAQx LBAHx UBAHx
Assuming Symmetry 0 0.0107100 0.2194866 0.0023218 0.1913698
Flat 0 0.0012049 0.1610976 0.0000000 0.1329459
Jeffreys 0 0.0000242 0.1166390 0.0000000 0.0904764
Near boundary 0 0.0000000 0.0089203 0.0000000 0.0021319
Adding Pseudo constant using ciAAllx(x=0,n=20,alp=0.05,h=2)
method x LowerLimit UpperLimit LowerAbb UpperAbb ZWI
Adj-Wald 0 0.0000000 0.1939085 YES NO NO
Adj-ArcSine 0 0.0085880 0.2238858 NO NO NO
Adj-Liklihood 0 0.0143776 0.2357444 NO NO NO
Adj-Score 0 0.0231588 0.2584880 NO NO NO
Adj-Logit Wald 0 0.0209299 0.2788112 NO NO NO
Adj-Wald-T 0 0.0000000 0.2231950 YES NO NO
Adding Continuity Correction, c = 1/(2n) & using ciCAllx(x=0,n=20,alp=0.05,c=1/40)
method x LowerLimit UpperLimit LowerAbb UpperAbb ZWI
Wald 0 0.0000000 0.0250000 YES NO NO
ArcSine 0 0.0584251 0.0584251 NO NO YES
Score 0 0.0045747 0.2004533 NO NO NO
Logit Wald 0 0.0000000 0.1684335 NO NO NO
Wald-T 0 0.0000000 0.2690055 YES NO NO
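
The Exact method limits for a general e can be sketched in base R by solving the tail equations numerically. Halving the two-sided P-value formula given earlier leads to \(e\Pr(X = x) + \Pr(X < x) = \alpha/2\) for the upper limit and \(e\Pr(X = x) + \Pr(X > x) = \alpha/2\) for the lower limit. The function name exact_ci, the boundary conventions (lower limit 0 at x = 0, upper limit 1 at x = n) and the requirement e > alp/2 at the boundaries are assumptions of this sketch, not documented behaviour of ciEXx.

```r
# Illustrative sketch, not the package implementation.
# Assumes e > alp / 2 when x = 0 or x = n so that a root exists.
exact_ci <- function(x, n, alp = 0.05, e = 0.5) {
  # Upper limit: e * Pr(X = x) + Pr(X < x) = alp / 2
  upper_eq <- function(p) {
    e * dbinom(x, n, p) + pbinom(x - 1, n, p) - alp / 2
  }
  # Lower limit: e * Pr(X = x) + Pr(X > x) = alp / 2
  lower_eq <- function(p) {
    e * dbinom(x, n, p) + pbinom(x, n, p, lower.tail = FALSE) - alp / 2
  }
  lower <- if (x == 0) 0 else uniroot(lower_eq, c(0, 1), tol = 1e-12)$root
  upper <- if (x == n) 1 else uniroot(upper_eq, c(0, 1), tol = 1e-12)$root
  c(lower = lower, upper = upper)
}

# Limits for x = 0, n = 20 across several values of e
# (e = 0.5 is mid-p, e = 1 is Clopper-Pearson)
sapply(c(0.1, 0.5, 0.95, 1), function(e) exact_ci(0, 20, e = e))
```

For x = 0 the upper equation reduces to \(e(1 - p)^n = \alpha/2\), so the limit can also be checked by hand as \(1 - (\alpha/(2e))^{1/n}\).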

Graphical Summaries

# 17. Paper 1
PlotciAllx(x=0,n=20,alp=0.05)

The corresponding comparison of the sum of lengths of the CIs can be obtained as below

# 18. Plot of sum of length of exact method
PlotlengthEX(n=10,alp=0.05,e=c(0.1,0.5,0.95,1),a=1,b=1) 

For other evaluation criteria, the package proportion provides ample scope for comparing competing methods. The following table and plots illustrate this for n = 250 (inspired by n = 263) using the functions covpAll(n,alp,a1,b1) and PlotcovpAll(n,alp,a1,b1).

Coverage probability using covpAll()
method MeanCP MinCP RMSE_N RMSE_M RMSE_MI tol
Wald 0.9364816 0.0008363 0.0634224 0.0619649 0.9376949 91.06
ArcSine 0.9411163 0.0176970 0.0604679 0.0598118 0.9253544 95.48
Lilelihood 0.9489788 0.0000000 0.0160055 0.0159729 0.9491132 96.72
Score 0.9506135 0.8516555 0.0056199 0.0055863 0.0991156 98.86
WaldLogit 0.9521187 0.8859012 0.0062506 0.0058806 0.0664781 98.02
Wald-T 0.9497880 0.9237747 0.0114848 0.0114829 0.0284351 94.70

For a more comparative case consider similar plot for \(n = 10\)

# 20. Paper 1
PlotcovpAll(n=10,alp=0.05,a=1,b=1,t1=0.93,t2=0.97)

Paper 2 (Joseph and Reinhold 2005):

A tutorial-style article on obtaining CIs by inverting two-tailed tests for a single proportion is available in Joseph and Reinhold 2005. It mainly deals with the Wald large-sample and Exact methods for CI and hypothesis testing involving values of p near the boundary. Only the interval for Wald is available in the paper; however, a comparison of procedures would enhance the presentation and purpose. One way is through a pictorial output, which can be improved further by sorting the CIs for each \(x = 0, 1, \ldots, n\) using the function PlotciAllg(n,alp).

# 21. Paper 2 - display the function
PlotciAllg(n=10,alp=0.05)

Further, one of the significant features of the package is its readily available Bayesian testing alternatives for a single binomial proportion (p). For example, the data from this paper involve the classical test \(H_0: p \le 0.9\) vs. \(H_1: p > 0.9\), for which the Bayes factor can be calculated using the function hypotestBAF4x(x,n,th0,a0,b0,a1,b1). (Six functions are available, covering the exhaustive possibilities for testing hypotheses on p.) Under the assumption of uniform and Jeffreys priors for the null and alternative models respectively, the numerical result for these data is 0.0832, which constitutes positive evidence against \(H_0\).

Additionally, the package has functions (such as hypotestBAF4) that compute the Bayes factor for all possible values of x, as listed below, so as to show how the Bayes factor, and in turn the decision, changes with x.

# 20. Paper 2 - display the function
hypotestBAF4(n=10, th0=0.9, a0=1,b0=1,a1=0.5,b1=0.5)
Hypothesis test, H0: p <= 0.9 vs. H1: p > 0.9
x BaFa01 Interpretation
0 3.987416e+10 Evidence against H1 is very strong
1 2.088687e+08 Evidence against H1 is very strong
2 3.619818e+06 Evidence against H1 is very strong
3 1.165063e+05 Evidence against H1 is very strong
4 5.924426e+03 Evidence against H1 is very strong
5 4.440244e+02 Evidence against H1 is very strong
6 4.749817e+01 Evidence against H1 is strong
7 7.123142e+00 Evidence against H1 is positive
8 1.449043e+00 Evidence against H1 is not worth more than a bare mention
9 3.619376e-01 Evidence against H0 is not worth more than a bare mention
10 8.319770e-02 Evidence against H0 is positive

Paper 3 (Zhou et al 2008) and 4 (Wei Yu et al 2012):

The main objective of Zhou et al 2008 is to improve the logit Wald method, which is illustrated with \(x = 16\) and \(n = 17\). Similarly, Wei Yu et al 2012 attempt an improvement of the Score method with a real data example (\(x = 16, n = 109\)). These two adjustment methods can easily be compared with other adjustment methods using the function ciAAllx from the package. The adjustment factor (h) is intentionally taken as zero to allow comparison with the original results of the respective studies. Such comparison is pervasive in statistical investigations involving a parameter, particularly p.

# 20. Function to evaluate ci varying the adding constant h
ciAAllx(x=16, n=17,alp = 0.05,h=0)
ciAAllx(x=16, n=109,alp = 0.05,h=0)

The full results are shown below with \(h\) values of 0,1 and 2.

CI with x=16, n=17 & h=0
method x LowerLimit UpperLimit LowerAbb UpperAbb ZWI
Adj-Wald 16 0.8293268 1.0000000 NO YES NO
Adj-ArcSine 16 0.7845775 0.9999467 NO NO NO
Adj-Liklihood 16 0.7656344 0.9965448 NO NO NO
Adj-Score 16 0.7301797 0.9895396 NO NO NO
Adj-Logit Wald 16 0.6796805 0.9917795 NO NO NO
Adj-Wald-T 16 0.7478459 1.0000000 NO YES NO
CI with x=16, n=109 & h=0
method x LowerLimit UpperLimit LowerAbb UpperAbb ZWI
Adj-Wald 16 0.0803520 0.2132260 NO NO NO
Adj-ArcSine 16 0.0869474 0.2190422 NO NO NO
Adj-Liklihood 16 0.0888655 0.2211349 NO NO NO
Adj-Score 16 0.0924191 0.2252076 NO NO NO
Adj-Logit Wald 16 0.0919145 0.2262616 NO NO NO
Adj-Wald-T 16 0.0788708 0.2147072 NO NO NO
CI with x=16, n=17 & h=1
method x LowerLimit UpperLimit LowerAbb UpperAbb ZWI
Adj-Wald 16 0.7567438 1.0000000 NO YES NO
Adj-ArcSine 16 0.7221104 0.9888902 NO NO NO
Adj-Liklihood 16 0.7090774 0.9816599 NO NO NO
Adj-Score 16 0.6860592 0.9706414 NO NO NO
Adj-Logit Wald 16 0.6626006 0.9735380 NO NO NO
Adj-Wald-T 16 0.7243814 1.0000000 NO YES NO
CI with x=16, n=109 & h=1
method x LowerLimit UpperLimit LowerAbb UpperAbb ZWI
Adj-Wald 16 0.0861567 0.2201496 NO NO NO
Adj-ArcSine 16 0.0925269 0.2257484 NO NO NO
Adj-Liklihood 16 0.0943876 0.2277104 NO NO NO
Adj-Score 16 0.0978748 0.2316357 NO NO NO
Adj-Logit Wald 16 0.0973834 0.2326298 NO NO NO
Adj-Wald-T 16 0.0847926 0.2215137 NO NO NO
CI with x=16, n=17 & h=2
method x LowerLimit UpperLimit LowerAbb UpperAbb ZWI
Adj-Wald 16 0.7074793 1.0000000 NO YES NO
Adj-ArcSine 16 0.6798301 0.9701145 NO NO NO
Adj-Liklihood 16 0.6701048 0.9624273 NO NO NO
Adj-Score 16 0.6536394 0.9501899 NO NO NO
Adj-Logit Wald 16 0.6386495 0.9532031 NO NO NO
Adj-Wald-T 16 0.6887962 1.0000000 NO YES NO
CI with x=16, n=109 & h=2
method x LowerLimit UpperLimit LowerAbb UpperAbb ZWI
Adj-Wald 16 0.0918193 0.2267648 NO NO NO
Adj-ArcSine 16 0.0979757 0.2321580 NO NO NO
Adj-Liklihood 16 0.0998091 0.2340433 NO NO NO
Adj-Score 16 0.1032005 0.2377869 NO NO NO
Adj-Logit Wald 16 0.1027218 0.2387274 NO NO NO
Adj-Wald-T 16 0.0905595 0.2280246 NO NO NO
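
The Adj-Wald and Adj-Score rows above can be reproduced in base R, assuming the adding constant enters as \(x + h\) successes in \(n + 2h\) trials before the usual Wald and Score (Wilson) formulas are applied. The sketch below is not the package source: the helper names adj_wald and adj_score are hypothetical, and limits are truncated to [0, 1].

```r
# Sketch only: assumes the adding constant h turns (x, n) into (x + h, n + 2h).
# adj_wald() and adj_score() are hypothetical helpers, not package functions.
adj_wald <- function(x, n, alp = 0.05, h = 0) {
  z  <- qnorm(1 - alp / 2)
  m  <- n + 2 * h                    # adjusted number of trials
  p  <- (x + h) / m                  # adjusted sample proportion
  se <- sqrt(p * (1 - p) / m)
  # Truncate to [0, 1]; the tables flag such cases as aberrations
  c(lower = max(0, p - z * se), upper = min(1, p + z * se))
}

adj_score <- function(x, n, alp = 0.05, h = 0) {
  z      <- qnorm(1 - alp / 2)
  m      <- n + 2 * h
  p      <- (x + h) / m
  centre <- (p + z^2 / (2 * m)) / (1 + z^2 / m)
  half   <- z * sqrt(p * (1 - p) / m + z^2 / (4 * m^2)) / (1 + z^2 / m)
  c(lower = max(0, centre - half), upper = min(1, centre + half))
}

# Print both intervals for x = 16, n = 17 at h = 0, 1, 2
for (h in 0:2) {
  print(round(rbind(wald  = adj_wald(16, 17, h = h),
                    score = adj_score(16, 17, h = h)), 7))
}
```

Under this assumption, larger values of h pull the adjusted proportion towards 0.5, which is why the lower limits for \(x = 16, n = 17\) fall as h grows.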

To compare the length of the intervals for the data \(x = 16, n = 17\), a graphical form can be obtained from the package using

# 21. Paper 3&4 - Plot of all the adjusted CI with h=1
PlotciAAllxg(x=16,n=17,alp=0.05,h=1)

As can be seen above, the grouping function (ending with g) conveniently orders the results within each value of x.

Another aspect is the way the Exact method is handled. Drawing on the extensive literature on adjusting the Exact method, this package restricts itself to the randomized test governed by the constant e in [0, 1]. Example 6 (Joseph and Reinhold 2005; see table above) may be reproduced with the function ciEXx as shown below.

# 22. Paper 3&4 - display the function
ciEXx(x=98, n=100,alp = 0.05,e=c(0.1,.5,0.95,1))
CI-Exact with x=98, n=100
x LEXx UEXx LABB UABB ZWI e
98 0.9429629 0.9946812 NO NO NO 0.10
98 0.9355021 0.9966313 NO NO NO 0.50
98 0.9300850 0.9975033 NO NO NO 0.95
98 0.9295962 0.9975662 NO NO NO 1.00
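
The role of e can be illustrated with a short base R sketch. It assumes each limit solves a tail-area equation in which the probability of the observed count is weighted by e, so that e = 1 gives Clopper Pearson and e = 0.5 gives Mid P. The helper exact_ci is hypothetical, is not the package source, and handles only \(0 < x < n\).

```r
# Sketch only: tail-area equations with weight e on the observed count.
# exact_ci() is a hypothetical helper and assumes 0 < x < n.
exact_ci <- function(x, n, alp = 0.05, e = 1) {
  stopifnot(x > 0, x < n, e >= 0, e <= 1)
  # Lower limit solves P(X > x) + e * P(X = x) = alp / 2
  f_low <- function(p) {
    pbinom(x, n, p, lower.tail = FALSE) + e * dbinom(x, n, p) - alp / 2
  }
  # Upper limit solves P(X < x) + e * P(X = x) = alp / 2
  f_upp <- function(p) {
    pbinom(x - 1, n, p) + e * dbinom(x, n, p) - alp / 2
  }
  c(lower = uniroot(f_low, c(0, 1), tol = 1e-12)$root,
    upper = uniroot(f_upp, c(0, 1), tol = 1e-12)$root)
}

# One row per value of e, for comparison with the ciEXx output
e_values <- c(0.1, 0.5, 0.95, 1)
cbind(e = e_values,
      t(sapply(e_values, function(e) exact_ci(98, 100, e = e))))
```

With e = 1 the limits coincide with the Beta quantile form of the Clopper Pearson interval, qbeta(alp/2, x, n - x + 1) and qbeta(1 - alp/2, x + 1, n - x); smaller values of e shorten the interval from both sides.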

Paper 5 (Tuyl et al 2008):

This paper compares different non-informative priors with an informative prior based on an earlier study for a single binomial proportion, using a real data set with \(x = 0, n = 167\). This is one of the most often cited examples of zero successes, a case that has attracted active research. The comparison is based on predictive densities and is used to argue for a specific prior assumption. This package provides ready-made options for Bayesian computation with posterior predictive distributions, allowing a wider comparison of priors and probabilities. A quick comparison under the Uniform prior, for zero successes or for a proportion of 0.5 in the new sample (xnew = m/2), can be explored using the function \(probPREx(x,n,xnew,m,a1,a2)\). Here xnew and m vary, while x=0, n=167 and a1=a2=1 are held fixed.

Predicted probability with x=0, n=167 varying xnew and m
x n xnew m preprb
0 167 0 10 0.9438202
0 167 0 50 0.7706422
0 167 0 100 0.6268657
0 167 0 150 0.5283019
0 167 5 10 0.0000002
0 167 25 50 0.0000000
0 167 50 100 0.0000000
0 167 75 150 0.0000000
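
These predicted probabilities are consistent with the beta-binomial posterior predictive distribution under a Beta(a1, a2) prior. The base R sketch below makes that assumption explicit; pred_prob is a hypothetical helper, not the package function, and works on the log scale to avoid overflow for large m.

```r
# Sketch only: beta-binomial posterior predictive probability under a
# Beta(a1, a2) prior. pred_prob() is a hypothetical helper.
pred_prob <- function(x, n, xnew, m, a1 = 1, a2 = 1) {
  a_post <- a1 + x          # posterior Beta shape parameters
  b_post <- a2 + n - x
  exp(lchoose(m, xnew) +
        lbeta(a_post + xnew, b_post + m - xnew) -
        lbeta(a_post, b_post))
}

# Zero successes versus half successes in a future sample of size m
m <- c(10, 50, 100, 150)
data.frame(m    = m,
           zero = pred_prob(0, 167, 0, m),
           half = pred_prob(0, 167, m / 2, m))
```

For the Uniform prior and \(x = 0\) the zero-success case reduces to \((n + 1)/(n + m + 1)\), for example \(168/178\) when m = 10.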

Assuming that the example depicts a rare event, the analysis can be extended with posterior probabilities using the function \(probPOSx(x,n,a,b,th)\).

Guidance for priors used below
Description a b
Uniform prior 1.000 1.00
Jeffreys prior 0.500 0.50
Tuyl p1 0.042 27.96
Tuyl p2 1.000 666.00
Tuyl p3 1.000 398.00
Posterior probability with x=0, n=167 varying th
Uniform Prior Jeffreys prior Tuyl p1 Tuyl p2 Tuyl p3
th=0.001 0.1547172 0.4370766 0.9479747 0.5654381 0.4318005
th=0.01 0.8151954 0.9332767 0.9976693 0.9997687 0.9965811
th=0.1 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
th=0.5 0.9998190 0.9999656 0.9999998 1.0000000 1.0000000
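
Assuming the tabulated quantity is the posterior probability \(P(p \le th \mid x)\) under a Beta(a, b) prior, it has a closed form through the Beta distribution function. The sketch below uses base R only; post_prob is a hypothetical helper, not the package function.

```r
# Sketch only: posterior probability P(p <= th | x) under a Beta(a, b) prior,
# using the conjugate Beta(a + x, b + n - x) posterior.
post_prob <- function(x, n, a, b, th) {
  pbeta(th, a + x, b + n - x)
}

# Prior parameters copied from the guidance table above
priors <- data.frame(
  prior = c("Uniform", "Jeffreys", "Tuyl p1", "Tuyl p2", "Tuyl p3"),
  a     = c(1, 0.5, 0.042, 1, 1),
  b     = c(1, 0.5, 27.96, 666, 398)
)

# Rows follow the priors, columns follow the thresholds
th  <- c(0.001, 0.01)
res <- sapply(th, function(t) post_prob(0, 167, priors$a, priors$b, t))
dimnames(res) <- list(priors$prior, paste0("th=", th))
round(res, 7)
```

For the Uniform prior and \(x = 0\) this reduces to \(1 - (1 - th)^{n + 1}\), which makes the entries for small th easy to check by hand.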

The literature often applies frequentist evaluation criteria to Bayesian methods as well; hence this package includes the most prominent of these methods, along with other measures, to enlarge the scope of comparison.

Paper 6 (Vos and Hudson 2005):

Vos and Hudson (2005) proposed p-confidence and p-bias as evaluation criteria. The results for these two measures for two types of Bayesian CI with \(n = 10\), obtained using \(pCOpBIBA(n,alp,a1,a2)\), are

p-Confidence & p-Bias of Bayesian method for n=10, a1=a2=1
x1 pconfQ pbiasQ pconfH pbiasH
1 58.75446 33.416269 86.04498 1.7034458
2 76.33893 15.007198 86.58091 1.4986566
3 82.59405 7.832379 87.14676 1.0040862
4 85.83952 3.477491 87.57388 0.4968235
5 87.87691 0.000000 87.87691 0.0000000
6 85.83952 3.477491 87.57388 0.4968235
7 82.59405 7.832379 87.14676 1.0040859
8 76.33893 15.007198 86.58091 1.4986566
9 58.75446 33.416269 86.04498 1.7034458

References

[1] 1998 Newcombe RG. Two-sided confidence intervals for the single proportion: Comparison of seven methods. Statistics in Medicine: 17; 857 - 872.

[2] 2005 Joseph L and Reinhold C. Statistical Inference for Proportions. American Journal of Roentgenology: 184; 1057 - 1064.

[3] 2008 Zhou, X. H., Li, C.M. and Yang, Z. Improving interval estimation of binomial proportions. Phil. Trans. R. Soc. A, 366, 2405-2418

[4] 2012 Wei Yu, Xu Guo and Wangli Xu. An improved score interval with a modified midpoint for a binomial proportion. Journal of Statistical Computation and Simulation, 84, 5, 1-17