dAD_analysis.Rd
dAD_analysis
is a wrapper function bundling EMfit_betabinom_robust
, EMfit_betabinom
, EMfit_betabinom_popcomb
,
pmf_betabinomMix
, and HWE_chisquared
to run the entire differential Allelic Dispersion detection analysis as discussed in the package vignette.
To this end, it goes over the following steps:
Calling EMfit_betabinom_robust
on both the control- and case-dataset separately for the sole purpose of outlier detection;
detected outliers in both datasets play no part in future Expectation-Maximization-fits.
Perform one fit on the joint (case+control) data using EMfit_betabinom
, i.e. assuming all population parameters are shared including
the heterozygous overdispersion parameter, i.e. assuming no differential Allelic Dispersion.
Perform a fit on the control- and case-data using EMfit_betabinom_popcomb
, which assumes all population parameters are shared
except for the heteroyzous overdispersion parameter, i.e. a fit accomodating differential Allelic Dispersion
Statistically test for differential Allelic Dispersion using a likelihood ratio test with one degree of freedom comparing the fits from the two previous points.
Filling out the results dataframe dAD_res
.
dAD_analysis(
datas,
SE,
inbr,
allelefreq = 0.5,
dltaco = 10^-6,
HWE = FALSE,
p_InitEst = FALSE,
p_inits = c(1/3, 1/3, 1/3),
ThetaInits = "moment",
ReEstThetas = "moment",
NoSplitHom = TRUE,
NoSplitHet = TRUE,
ResetThetaMin = 10^-10,
ResetThetaMax = 10^-1,
DistRob = "Cook",
CookMargin = 5,
LikEmpNum = 1000,
LikMargin = 0,
NumHetMin = 5,
MaxOutFrac = 0.5,
thetaTRY = c(10^-1, 10^-3, 10^-7),
probshift_init = 0.5,
FirstFewFixed = NULL,
MemLim = 2048,
Xtra = 5,
epsabs = 0.001,
MaxIt = 100
)
A list containing two lists of dataframes, which in turn contain control- and case-data (in that order). Subsequent entries in both lists
correspond to subsequent loci (lists must be the same size and named per-locus). Each of the dataframes should at least contain both a ref_count
and
var_count
column.
Number. Sequencing error rate population metaparameter.
Number. Inbreeding coefficient population metaparameter.
Remaining
control parameters to be passed to EMfit_betabinom_robust
, EMfit_betabinom
, and EMfit_betabinom_popcomb
. See their respective documentation
for more information. Usually, these parameter's default values are fine.
A dataframe with one row per locus, containing for that locus in its columns:
Locus name.
Gene name.
Raw likelihood ratio test p-value testing for differential allelic dispersion between controls and cases (i.e. a difference between ThetaHetC and ThetaHetT).
Estimated reference homozygote mixture component in the beta-binomial mixture dAD fit.
Estimated heterozygote mixture component in the beta-binomial mixture dAD fit.
Estimated variant homozygote mixture component in the beta-binomial mixture dAD fit.
Estimated heterozygous pi parameter (assumed equal in controls and cases) in the beta-binomial mixture dAD fit.
Estimated control heterozygous theta parameter in the beta-binomial mixture dAD fit.
Estimated case heterozygous theta parameter in the beta-binomial mixture dAD fit.
Estimated control heterozygous rho parameter (0-to-1 rescaling of theta) in the beta-binomial mixture dAD fit.
Estimated case heterozygous rho parameter (0-to-1 rescaling of theta) in the beta-binomial mixture dAD fit.
Estimated homozygous theta parameter in the beta-binomial mixture dAD fit.
Estimated reference homozygote mixture component in the beta-binomial mixture fit assuming no dAD.
Estimated heterozygote mixture component in the beta-binomial mixture fit assuming no dAD.
Estimated variant homozygote mixture component in the beta-binomial mixture fit assuming no dAD.
Estimated heterozygous pi parameter (assumed equal in controls and cases) in the beta-binomial mixture fit assuming no dAD.
Estimated heterozygous theta parameter in the beta-binomial mixture fit assuming no dAD.
Estimated heterozygous rho parameter (0-to-1 rescaling of theta) in the beta-binomial mixture fit assuming no dAD.
Estimated homozygous theta parameter in the beta-binomial mixture fit assuming no dAD.
Estimated reference homozygote mixture component in the beta-binomial mixture fit on control samples alone.
Estimated heterozygote mixture component in the beta-binomial mixture fit on control samples alone.
Estimated variant homozygote mixture component in the beta-binomial mixture fit on control samples alone.
Estimated heterozygous pi parameter in the beta-binomial mixture fit on control samples alone.
Estimated heterozygous theta parameter in the beta-binomial mixture fit on control samples alone.
Estimated homozygous theta parameter in the beta-binomial mixture fit on control samples alone.
Estimated reference homozygote mixture component in the beta-binomial mixture fit on case samples alone.
Estimated heterozygote mixture component in the beta-binomial mixture fit on case samples alone.
Estimated variant homozygote mixture component in the beta-binomial mixture fit on case samples alone.
Estimated heterozygous pi parameter in the beta-binomial mixture fit on case samples alone.
Estimated heterozygous theta parameter in the beta-binomial mixture fit on case samples alone.
Estimated homozygous theta parameter in the beta-binomial mixture fit on case samples alone.
Estimated number of heterozygotes in controls, according to a fit on controls alone. Outliers, which are detected at the time of this fit, are counted towards this number, but not used in any following fitting procedures.
Estimated number of heterozygotes in cases, according to a fit on cases alone. Outliers, which are detected at the time of this fit, are counted towards this number, but not used in any following fitting procedures.
The RobFlag
output of the separate outlier-detection fit on controls, see EMfit_betabinom_robust
's documentation.
The RobFlag
output of the separate outlier-detection fit on cases, see EMfit_betabinom_robust
's documentation.
HWE chi squared p-value for controls, see HWE_chisquared
. Performed on the outlier-detection fit on controls alone.
HWE chi squared p-value for cases, see HWE_chisquared
. Performed on the outlier-detection fit on cases alone.
Mean coverage (reference + variant count) across control data.
Mean coverage (reference + variant count) across case data.
Median coverage (reference + variant count) across control data.
Median coverage (reference + variant count) across case data.
Number of outliers detected in control data.
Number of outliers detected in case data.
Number of expectation-maximization iterations run during the beta-binomial mixture fit assuming no dAD.
Number of expectation-maximization iterations run during the beta-binomial dAD mixture fit.
Quality flag returned by EMfit_betabinom_robust
(see documentation) of the fit on control data alone.
Quality flag returned by EMfit_betabinom_robust
(see documentation) of the fit on case data alone.