-vcf [file]
File path of the isolate vcf. Assume all variants are PASS
in the QUAL
column, the VCF file also reqires the AD
field.
Note: In the current implementation, DEploid
only take the first sample in the VCF file. DEploid
DO NOT handle multi-allelic variants, nor indels. The FILTER
column will not be used.
-plaf [file]
File path of population level allele frequencies (tab-delimited plain text file), for example
CHROM | POS | PLAF |
---|---|---|
Pf3D7_01_v3 | 93157 | 0.0190612159917058 |
Pf3D7_01_v3 | 94422 | 0.135502358766423 |
Pf3D7_01_v3 | 94459 | 0.156294363760064 |
Pf3D7_01_v3 | 94487 | 0.143439298925837 |
-panel [file]
File path of the reference panel (tab-delimited plain text file), for example
CHROM | POS | 3D7 | Dd2 | Hb3 | 7G8 |
---|---|---|---|---|---|
Pf3D7_01_v3 | 93157 | 0 | 0 | 0 | 1 |
Pf3D7_01_v3 | 94422 | 0 | 0 | 0 | 1 |
Pf3D7_01_v3 | 94459 | 0 | 0 | 0 | 1 |
Pf3D7_01_v3 | 94487 | 0 | 0 | 0 | 1 |
-noPanel
Use population level allele frequency as prior.
Warning: Flags -panel
and -noPanel
should not be used together.
‘-exclude [file]’ File path of sites to be excluded (tab-delimited plain text file).
‘-o [string]’ Specify the file name prefix of the output.
‘-k [int]’ Number of strain (default value 5).
‘-nSample [int]’ Number of MCMC samples (default value 800).
‘-rate [int]’ MCMC sample rate (default value 5).
‘-burn [float]’ MCMC burn rate (default value 0.5).
‘-v , -version’ DEploid version.
‘-vcfOut’ Save final halpotypes into a VCF file.
‘-ref [file] -alt [file]’ File path of reference and alternative allele count (tab-delimited plain text file).
Note: In early dEploid
versions (prior to v0.2-release
), allele counts extracted from the vcf file are placed in two files, and parsed by flags -ref [file]
and -alt [file]
. Tab-delimited plain text for input. First and second columns record chromosome and position labels respectively. Third columns records the reference allele count or alternative allele count. For example,
CHROM | POS | PG0390.C |
---|---|---|
Pf3D7_01_v3 | 93157 | 85 |
Pf3D7_01_v3 | 94422 | 77 |
Pf3D7_01_v3 | 94459 | 90 |
Pf3D7_01_v3 | 94487 | 79 |
CHROM | POS | PG0390.C |
---|---|---|
Pf3D7_01_v3 | 93157 | 0 |
Pf3D7_01_v3 | 94422 | 0 |
Pf3D7_01_v3 | 94459 | 0 |
Pf3D7_01_v3 | 94487 | 0 |
Warning: Flags -ref
and -alt
should not be used with -vcf
.
-forbidUpdateProp
Forbid MCMC moves to update proportions.
-forbidUpdateSingle
Forbid MCMC moves to update single haplotype.
-forbidUpdatePair
Forbid MCMC moves to update pair haplotypes.
-exportPostProb
Save the posterior probabilities of the final iteration of all strains.
-miss [float]
Miss copying probability
-recomb [float]
Constant recombination probability
-initialP [float ...]
Initialize proportions.
-p [int]
Output precision (default value 8).
Data exploration, plot the read count ALT
vs REF
.
library(DEploid)
vcfFile = system.file("extdata", "PG0390-C.test.vcf.gz", package = "DEploid")
PG0390 = extractCoverageFromVcf(vcfFile)
plotAltVsRef( PG0390$refCount, PG0390$altCount )
Plot the histogram of the observed allele frequency within sample.
obsWSAF = computeObsWSAF( PG0390$altCount, PG0390$refCount )
histWSAF(obsWSAF)
Load prior information: PLAF and reference panel
plafFile = system.file("extdata", "labStrains.test.PLAF.txt", package = "DEploid")
plaf = extractPLAF(plafFile)
panelFile = system.file("extdata", "labStrains.test.panel.txt", package = "DEploid")
Deconvolute the haplotypes
set.seed(1)
PG0390.deconv = dEploid(paste("-vcf", vcfFile, "-plaf", plafFile, "-panel", panelFile))
prop = PG0390.deconv$Proportions[dim(PG0390.deconv$Proportions)[1],]
expWSAF = t(PG0390.deconv$Haps) %*% prop
Plot the allele frequency within sample (observed in red, expected in blue) against the population level allele frequency.
plotWSAFvsPLAF(plaf, obsWSAF, expWSAF)
Plot the history of the MCMC proportion estimates.
plotProportions(PG0390.deconv$Proportions)
Plot the allele frequency within sample, observed against expected.
plotObsExpWSAF(obsWSAF, expWSAF)