eqdist.etest {energy}                R Documentation

Multisample E-statistic (Energy) Test of Equal Distributions

Description

Performs the nonparametric multisample E-statistic (energy) test for equality of multivariate distributions.

Usage

 eqdist.etest(x, sizes, distance = FALSE, 
              incomplete = FALSE, N = 100, R = 999)

Arguments

x           data matrix of the pooled sample, one observation per row
sizes       vector of sample sizes
distance    logical: if TRUE, the first argument is a distance matrix
incomplete  logical: if TRUE, compute incomplete E-statistics
N           maximum number of observations used from each sample for incomplete statistics
R           number of bootstrap replicates

Details

The k-sample multivariate E-test of equal distributions is performed. The statistic is computed from the original pooled samples, stacked in matrix x where each row is a multivariate observation, or the corresponding distance matrix. The first sizes[1] rows of x are the first sample, the next sizes[2] rows of x are the second sample, etc.
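As a sketch of this layout, the pooled matrix and the sizes vector can be assembled from separate samples in base R. The object names (samples, x, sizes, idx, dst) and the simulated data are illustrative assumptions, not part of the package.

```r
## Three hypothetical samples in d = 2 dimensions (illustrative data only)
set.seed(1)
samples <- list(
  a = matrix(rnorm(20), ncol = 2),   # 10 observations
  b = matrix(rnorm(30), ncol = 2),   # 15 observations
  c = matrix(rnorm(24), ncol = 2)    # 12 observations
)

## Stack the samples row-wise; each row is one multivariate observation
x <- do.call(rbind, samples)

## sizes[i] is the number of rows that belong to the i-th sample
sizes <- vapply(samples, nrow, integer(1))

## Row indices of each sample inside the pooled matrix
idx <- split(seq_len(nrow(x)), rep(seq_along(sizes), sizes))

## Full Euclidean distance matrix of the pooled sample,
## the kind of input assumed here for distance = TRUE
dst <- as.matrix(dist(x))
```

Here idx[[2]] holds rows 11 to 25, that is, the sizes[2] rows that follow the first sizes[1] rows.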

The test is implemented by nonparametric bootstrap, an approximate permutation test with R replicates. For large samples, it is more efficient to supply the data matrix in x rather than the distance matrix. Incomplete statistics are supported for the two-sample test. If incomplete is TRUE, at most N observations from each sample (drawn by sampling without replacement) are used to calculate the statistic. If distance is TRUE, complete statistics are always computed.
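The approximate permutation procedure can be sketched in base R for the two-sample case. This is a minimal illustration, not the package's implementation: the helper names estat2 and perm_etest are hypothetical, the statistic uses the commonly published two-sample energy form (an assumption here, since this page defers the definition to ksample.e), and the p-value convention shown is one common choice that may differ from the one used by eqdist.etest.

```r
## Two-sample energy statistic from a full distance matrix d, where
## rows/columns i1 form sample 1 and i2 form sample 2.
## ASSUMPTION: commonly published form, not taken from the package source.
estat2 <- function(d, i1, i2) {
  n1 <- length(i1)
  n2 <- length(i2)
  between <- mean(d[i1, i2])   # average between-sample distance
  within1 <- mean(d[i1, i1])   # average within-sample distance (zero diagonal included)
  within2 <- mean(d[i2, i2])
  (n1 * n2 / (n1 + n2)) * (2 * between - within1 - within2)
}

## Approximate permutation test: reshuffle the pooled observations R times
## and recompute the statistic with the same sample sizes.
perm_etest <- function(x, sizes, R = 199) {
  stopifnot(length(sizes) == 2, sum(sizes) == NROW(x))
  d  <- as.matrix(dist(x))          # distances are computed once and reused
  n  <- sum(sizes)
  i1 <- seq_len(sizes[1])
  i2 <- sizes[1] + seq_len(sizes[2])
  observed <- estat2(d, i1, i2)
  replicates <- replicate(R, {
    p <- sample(n)                  # random permutation of the pooled rows
    estat2(d, p[i1], p[i2])
  })
  ## ASSUMPTION: the observed value is counted among the replicates
  p.value <- (1 + sum(replicates >= observed)) / (R + 1)
  list(statistic = observed, p.value = p.value, replicates = replicates)
}

## Usage on simulated data with a clear location shift
set.seed(42)
x <- rbind(matrix(rnorm(60), ncol = 2),
           matrix(rnorm(60, mean = 2), ncol = 2))
res <- perm_etest(x, c(30, 30), R = 199)
```

Computing the distance matrix once and permuting indices avoids recomputing distances for each replicate; only the assignment of rows to samples changes.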

The definition of the multisample E-statistic is given in the ksample.e documentation.

Value

A list with class etest.eqdist containing

method      Description of test
statistic   Observed value of the test statistic
p.value     Approximate p-value of the test
sizes       Vector of sample sizes
R           Number of replicates
replicates  Vector of replicates of the statistic

Author(s)

Maria L. Rizzo <rizzo@math.ohiou.edu> and Gabor J. Szekely <gabors@bgnet.bgsu.edu>

References

Szekely, G. J. and Rizzo, M. L. (2003) Testing for Equal Distributions in High Dimension, submitted.

Szekely, G. J. (2000) E-statistics: Energy of Statistical Samples, preprint.

See Also

ksample.e, print.etest.eqdist, edist, energy.hclust

Examples

 data(iris)
 
 ## test if the 3 varieties of iris data (d=4) have equal distributions
 eqdist.etest(iris[,1:4], c(50,50,50))

 ## compare incomplete versions of the two-sample test (N = 100 vs. N = 200)
 x <- c(rpois(400, 2), rnbinom(600, size=1, mu=2))
 eqdist.etest(x, c(400, 600), incomplete=TRUE, N=100)
 eqdist.etest(x, c(400, 600), incomplete=TRUE, N=200)
  

