Basic Genetics

This vignette shows how mixIndependR calculates basic genetic parameters: allele frequencies, genotype frequencies, heterozygosity, allele sharing, and Hardy-Weinberg Equilibrium tests.

Data Import

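The imported dataset should be in bi-allelic format, with two columns per locus (one allele per column) and one row per individual. Such a table can be converted from a vcf file, Gen or other genotype data files. The example dataset x used throughout this vignette contains ten individuals typed at two loci, STR1 and SNP1: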

#>    STR1 STR1_1 SNP1 SNP1_1
#> 1    12     12    A      A
#> 2    13     14    T      T
#> 3    13     13    A      T
#> 4    14     15    A      T
#> 5    15     13    T      A
#> 6    13     14    A      T
#> 7    14     13    A      A
#> 8    12     12    T      A
#> 9    14     14    T      T
#> 10   15     15    A      T
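
For readers who want to follow along without an input file, the table above can be rebuilt by hand. This is a minimal base-R sketch; whether the package expects STR alleles as numbers or as character strings is an assumption here, not something this vignette states.

```r
# Rebuild the example table by hand (base R only).
# Each locus occupies two columns, one allele per column.
x <- data.frame(
  STR1   = c(12, 13, 13, 14, 15, 13, 14, 12, 14, 15),
  STR1_1 = c(12, 14, 13, 15, 13, 14, 13, 12, 14, 15),
  SNP1   = c("A", "T", "A", "A", "T", "A", "A", "T", "T", "A"),
  SNP1_1 = c("A", "T", "T", "T", "A", "T", "A", "A", "T", "T"),
  stringsAsFactors = FALSE
)
print(x)
```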

Basic Genetic Parameters

library(mixIndependR)

AlleleFreq calculates the allele frequencies for one dataset.

AlleleFreq(x)
#>    STR1 SNP1
#> 12  0.2  0.0
#> 13  0.3  0.0
#> 14  0.3  0.0
#> 15  0.2  0.0
#> A   0.0  0.5
#> T   0.0  0.5
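
The STR1 column above can be checked by simple allele counting: pool both allele columns of the locus and divide each count by the total number of allele copies. The sketch below uses base R only and is an illustration of the arithmetic, not the package's internal code.

```r
# Both allele columns of STR1, copied from the example table
a1 <- c(12, 13, 13, 14, 15, 13, 14, 12, 14, 15)
a2 <- c(12, 14, 13, 15, 13, 14, 13, 12, 14, 15)

# Pool all 2N allele copies and divide each count by 2N
freq <- table(c(a1, a2)) / (2 * length(a1))
print(freq)  # 12: 0.2, 13: 0.3, 14: 0.3, 15: 0.2
```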

GenotypeFreq calculates the observed or expected genotype frequencies. If expect=FALSE, the observed genotypes are tabulated from the original dataset (shown as counts in the output below). If expect=TRUE, the expected genotype probabilities under Hardy-Weinberg Equilibrium are calculated from the allele frequency table.

p <- AlleleFreq(x)
GenotypeFreq(x,p,expect = FALSE)
#>       STR1 SNP1
#> 12 12    2    0
#> 13 13    1    0
#> 14 14    1    0
#> 15 15    1    0
#> A A      0    2
#> T T      0    2
#> 12 13    0    0
#> 12 14    0    0
#> 12 15    0    0
#> 12 A     0    0
#> 12 T     0    0
#> 13 14    3    0
#> 13 15    1    0
#> 13 A     0    0
#> 13 T     0    0
#> 14 15    1    0
#> 14 A     0    0
#> 14 T     0    0
#> 15 A     0    0
#> 15 T     0    0
#> A T      0    6
GenotypeFreq(x,p,expect = TRUE)
#>       STR1 SNP1
#> 12 12 0.04 0.00
#> 13 13 0.09 0.00
#> 14 14 0.09 0.00
#> 15 15 0.04 0.00
#> A A   0.00 0.25
#> T T   0.00 0.25
#> 12 13 0.12 0.00
#> 12 14 0.12 0.00
#> 12 15 0.08 0.00
#> 12 A  0.00 0.00
#> 12 T  0.00 0.00
#> 13 14 0.18 0.00
#> 13 15 0.12 0.00
#> 13 A  0.00 0.00
#> 13 T  0.00 0.00
#> 14 15 0.12 0.00
#> 14 A  0.00 0.00
#> 14 T  0.00 0.00
#> 15 A  0.00 0.00
#> 15 T  0.00 0.00
#> A T   0.00 0.50
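
The expected values for STR1 follow the usual Hardy-Weinberg proportions: p_i^2 for a homozygote and 2 * p_i * p_j for a heterozygote. A minimal base-R sketch of that arithmetic, with the allele frequencies typed in from the AlleleFreq output (variable names are illustrative):

```r
# Allele frequencies of STR1, taken from the AlleleFreq output
p <- c("12" = 0.2, "13" = 0.3, "14" = 0.3, "15" = 0.2)

pp  <- outer(p, p)              # pp[i, j] = p_i * p_j
hom <- diag(pp)                 # homozygotes: p_i^2
het <- 2 * pp[upper.tri(pp)]    # heterozygotes: 2 * p_i * p_j for i < j

print(hom)                      # 0.04 0.09 0.09 0.04
print(het)                      # order: 12/13, 12/14, 13/14, 12/15, 13/15, 14/15
print(sum(hom) + sum(het))      # all genotype probabilities sum to 1
```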

Heterozygous tests the heterozygosity of each individual at each locus and outputs a table in which 0 denotes a homozygous genotype and 1 a heterozygous genotype.

h <- Heterozygous(x)
print(h)
#>       STR1 SNP1
#>  [1,]    0    0
#>  [2,]    1    0
#>  [3,]    0    1
#>  [4,]    1    1
#>  [5,]    1    1
#>  [6,]    1    1
#>  [7,]    1    0
#>  [8,]    0    1
#>  [9,]    0    0
#> [10,]    0    1
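
The indicator is just a comparison of the two allele columns of a locus. The base-R sketch below reproduces the STR1 column of the table above; it illustrates the idea and is not the package's implementation.

```r
# Both allele columns of STR1, copied from the example table
a1 <- c(12, 13, 13, 14, 15, 13, 14, 12, 14, 15)
a2 <- c(12, 14, 13, 15, 13, 14, 13, 12, 14, 15)

# 1 if the two alleles differ (heterozygous), 0 if they match (homozygous)
h_str1 <- as.integer(a1 != a2)
print(h_str1)  # 0 1 0 1 1 1 1 0 0 0
```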

RxpHetero calculates the real or expected average heterozygosity at each locus. If HWE=TRUE, it calculates the expected heterozygosity under Hardy-Weinberg Equilibrium; if HWE=FALSE, it calculates the real (observed) average heterozygosity.

H <- RxpHetero(h,p,HWE=TRUE)
head(H)
#> STR1 SNP1 
#> 0.74 0.50
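
The expected values above agree with the standard formula 1 - sum(p_i^2). The sketch below checks this in base R and, for comparison, takes the column means of the 0/1 heterozygosity table as an observed average. Treating the observed average as a plain column mean is an assumption here, not a statement about what RxpHetero returns with HWE=FALSE.

```r
# Allele frequencies, taken from the AlleleFreq output
p_str1 <- c(0.2, 0.3, 0.3, 0.2)
p_snp1 <- c(0.5, 0.5)

# Expected heterozygosity under HWE: 1 minus the total homozygote probability
exp_het <- c(STR1 = 1 - sum(p_str1^2), SNP1 = 1 - sum(p_snp1^2))
print(exp_het)  # STR1 0.74, SNP1 0.50

# Observed average (assumption: mean of the 0/1 indicator per locus)
h <- cbind(STR1 = c(0, 1, 0, 1, 1, 1, 1, 0, 0, 0),
           SNP1 = c(0, 0, 1, 1, 1, 1, 0, 1, 0, 1))
obs_het <- colMeans(h)
print(obs_het)  # STR1 0.5, SNP1 0.6
```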

AlleleShare_Table builds a table of the number of shared alleles for each pair of individuals at each locus. If replicate=TRUE, the pairs are formed with replicates; if replicate=FALSE, the pairs are formed without replicates.

AS <- AlleleShare_Table(x,replicate=TRUE)
head(AS)
#>      STR1 SNP1
#> 1vs2    0    0
#> 1vs3    0    1
#> 1vs4    0    1
#> 1vs5    0    1
#> 1vs6    0    1
#> 1vs7    0    2
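
One way to count shared alleles between two genotypes is to take the size of their multiset intersection, so that allele order within a genotype does not matter. The helper below, shared_alleles, is a hypothetical name used only for this sketch; that the package counts this way is an assumption, although it is consistent with the rows printed above.

```r
# Hypothetical helper: number of alleles two genotypes have in common (0, 1 or 2)
shared_alleles <- function(g1, g2) {
  lev <- union(g1, g2)                               # all alleles seen in either genotype
  c1 <- tabulate(match(g1, lev), nbins = length(lev)) # copies of each allele in g1
  c2 <- tabulate(match(g2, lev), nbins = length(lev)) # copies of each allele in g2
  sum(pmin(c1, c2))                                  # multiset intersection size
}

shared_alleles(c("A", "A"), c("T", "T"))  # individuals 1 vs 2 at SNP1: 0
shared_alleles(c("A", "A"), c("A", "T"))  # individuals 1 vs 3 at SNP1: 1
shared_alleles(c("A", "A"), c("A", "A"))  # individuals 1 vs 7 at SNP1: 2
shared_alleles(c(13, 14), c(14, 13))      # allele order is ignored: 2
```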

RealProAlleleShare and ExpProAlleleShare calculate, respectively, the observed average proportions and the expected probabilities of sharing 0, 1 and 2 alleles at each locus.

e <- RealProAlleleShare(AS)
e0 <- ExpProAlleleShare(p)
head(e)
#>              P0        P1         P2
#> STR1 0.53333333 0.3777778 0.08888889
#> SNP1 0.08888889 0.5333333 0.37777778
head(e0)
#>         P0     P1     P2
#> STR1 0.317 0.5672 0.1158
#> SNP1 0.125 0.5000 0.3750
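
The expected probabilities in e0 agree with the standard formulas for two unrelated individuals under Hardy-Weinberg Equilibrium, written in terms of the power sums of the allele frequencies. The function exp_share below is a hypothetical name for this base-R sketch; it is shown because it reproduces the numbers above, not because the package is known to compute them this way.

```r
# Hypothetical helper: P(share 0, 1, 2 alleles) for two unrelated individuals
exp_share <- function(p) {
  a2 <- sum(p^2)   # power sums of the allele frequencies
  a3 <- sum(p^3)
  a4 <- sum(p^4)
  c(P0 = 1 - 4 * a2 + 2 * a2^2 + 4 * a3 - 3 * a4,
    P1 = 4 * a2 - 4 * a2^2 - 4 * a3 + 4 * a4,
    P2 = 2 * a2^2 - a4)
}

exp_share(c(0.2, 0.3, 0.3, 0.2))  # STR1: 0.317 0.5672 0.1158
exp_share(c(0.5, 0.5))            # SNP1: 0.125 0.500 0.375
```

The three probabilities sum to 1 for any valid set of allele frequencies, which is a quick sanity check on the formulas.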

HWE.Chisq and HWE.Fisher test Hardy-Weinberg Equilibrium with Pearson’s Chi-square test and Fisher’s exact test, respectively. B is an integer specifying the number of replicates used in the Monte Carlo test; it is passed as the last argument (2000) in the HWE.Chisq call below.

g <- GenotypeFreq(x,p,expect=FALSE)
g0 <- GenotypeFreq(x,p,expect=TRUE)
HWE.Chisq(g,g0,rescale.p=FALSE,simulate.p.value=TRUE,2000)
#> $chi
#>     STR1     SNP1 
#> 4.544444 0.400000 
#> 
#> $pvalue
#>      STR1      SNP1 
#> 0.6266867 0.8515742
HWE.Fisher(p,H,g/colSums(g))
#>      STR1      SNP1 
#> 0.8173266 0.9387241
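
As a cross-check for the SNP1 locus only, the Chi-square statistic can be recomputed with base R's chisq.test from the observed genotype counts and the expected probabilities listed earlier. This is a sketch under the assumption that a Monte Carlo goodness-of-fit test is a reasonable stand-in; it is not the package's own code. The simulated p-value depends on the random seed, so only the statistic is compared.

```r
# SNP1 genotype counts (expect = FALSE) and HWE probabilities (expect = TRUE)
obs   <- c(AA = 2, TT = 2, AT = 6)
exp_p <- c(AA = 0.25, TT = 0.25, AT = 0.50)

set.seed(1)  # the Monte Carlo p-value varies with the seed
res <- chisq.test(obs, p = exp_p, simulate.p.value = TRUE, B = 2000)

res$statistic  # 0.4, matching the SNP1 entry of $chi above
res$p.value    # simulated, so it will differ slightly between runs
```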