EMfit_betabinom_robust's fitted model
LikelyDistsHet.Rd
LikelyDistsHet is mainly for internal use by EMfit_betabinom_robust. It helps make that function's EM fit robust by iteratively re-fitting the model on the entire input dataset except for one data point, and logging the resulting differences in the heterozygous pi and theta estimates and in the likelihood.
Each difference from the corresponding full-data fit value measures the left-out data point's influence on the model fit;
if any of them is sufficiently large, the data point could be considered an outlier.
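The leave-one-out idea described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helpers bb_loglik, fit_bb and loo_par_dists are hypothetical names, the parametrization alpha = pi / theta, beta = (1 - pi) / theta is an assumption, and the sequencing error rate and the homozygous peaks are ignored.

```r
# Per-sample beta-binomial log-likelihood, parametrized by mean pi and
# dispersion theta (ASSUMED: alpha = pi / theta, beta = (1 - pi) / theta).
bb_loglik <- function(par, ref, var) {
  a <- par[1] / par[2]
  b <- (1 - par[1]) / par[2]
  lchoose(ref + var, ref) + lbeta(ref + a, var + b) - lbeta(a, b)
}

# Weighted maximum-likelihood fit of c(pi, theta); w plays the role of sprv.
fit_bb <- function(ref, var, w, start = c(0.5, 0.1)) {
  nll <- function(par) -sum(w * bb_loglik(par, ref, var))
  optim(start, nll, method = "L-BFGS-B",
        lower = c(1e-4, 1e-4), upper = c(1 - 1e-4, 10))$par
}

# Re-fit once per left-out sample and return (refit - full-data) estimates.
loo_par_dists <- function(ref, var, w) {
  full <- fit_bb(ref, var, w)
  d <- t(vapply(seq_along(ref), function(i) {
    fit_bb(ref[-i], var[-i], w[-i], start = full) - full
  }, numeric(2)))
  colnames(d) <- c("pi_dist", "theta_dist")
  d
}

set.seed(1)
n <- 30
p <- rbeta(n, 20, 20)        # simulated, roughly balanced heterozygotes
ref <- rbinom(n, 50, p)
var <- 50 - ref
ref[1] <- 49; var[1] <- 1    # one deliberately extreme sample
dists <- loo_par_dists(ref, var, w = rep(1, n))
head(dists)
```

Samples whose removal shifts pi or theta the most are the ones with the largest influence on the fit.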
LikelyDistsHet(
ref_counts,
var_counts,
sprv,
parvec_cur,
NoSplitHet,
ResetThetaMin,
ResetThetaMax,
SE,
ReEstPars = FALSE
)
ref_counts: Numeric vector. Reference counts.
var_counts: Numeric vector. Variant counts.
sprv: Numeric vector. Each sample's EM weight, reflecting its likelihood of belonging to the heterozygous population.
parvec_cur: Numeric vector. Pi and theta (in that order) of the heterozygous peak of the full-data fit.
NoSplitHet: Logical. If TRUE, the beta-binomial fit for heterozygotes is not allowed to be bimodal.
ResetThetaMin: Number. Initial theta values in the numeric optimization are raised to this minimum (e.g. when the moment estimate is even lower).
ResetThetaMax: Number. Initial theta values in the numeric optimization are capped at this maximum (e.g. when the moment estimate is even higher).
SE: Number. Sequencing error rate.
ReEstPars: Logical. If TRUE, re-estimates parvec_cur given ref_counts and var_counts. This is unnecessary if these are the actual counts of the full dataset, but it is useful for an empirical approach in which the parameter and likelihood distances "expected" when the assumed model is 100% correct are simulated by drawing ref_counts and var_counts from this assumed model (see EMfit_betabinom_robust).
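As a rough illustration of drawing ref_counts and var_counts from an assumed model, the sketch below simulates counts from a single beta-binomial heterozygous peak. The function sim_bb_counts, the coverage values and the parametrization alpha = pi / theta, beta = (1 - pi) / theta are assumptions made for this example; how EMfit_betabinom_robust actually performs its simulation is not described here.

```r
# Draw reference/variant counts from a beta-binomial with mean pi_het and
# dispersion theta_het (ASSUMED: alpha = pi / theta, beta = (1 - pi) / theta).
# Sequencing error and homozygous peaks are deliberately left out.
sim_bb_counts <- function(coverage, pi_het, theta_het) {
  n <- length(coverage)
  p <- rbeta(n, pi_het / theta_het, (1 - pi_het) / theta_het)
  ref <- rbinom(n, size = coverage, prob = p)
  list(ref_counts = ref, var_counts = coverage - ref)
}

set.seed(42)
coverage <- c(40, 55, 60, 35, 80)  # hypothetical per-sample total counts
sim <- sim_bb_counts(coverage, pi_het = 0.5, theta_het = 0.05)
str(sim)
```

Counts simulated this way do not come from a full-data fit, which is the situation in which re-estimating the parameters first (ReEstPars = TRUE) would make sense.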
A list containing the following components:
A vector containing likelihood distances per sample (2 times the difference between the full-data log-likelihood and the log-likelihood re-fitted with the sample left out).
A vector containing pi distances per sample (leave-sample-out refitted pi minus full-data pi).
A vector containing theta distances per sample (leave-sample-out refitted theta minus full-data theta).
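The three distances can be illustrated with a few made-up numbers. Everything below is hypothetical: the values, the list layout and the component names lik_dist, pi_dist and theta_dist are chosen for this example only, and the likelihood distance is read as twice the difference between the full-data and leave-one-out log-likelihoods.

```r
# HYPOTHETICAL full-data results and leave-one-out refits for four samples.
full <- list(loglik = -120.0, pi = 0.50, theta = 0.05)
loo <- list(
  loglik = c(-120.4, -120.1, -123.5, -120.2),
  pi     = c(0.501, 0.499, 0.470, 0.500),
  theta  = c(0.050, 0.051, 0.032, 0.049)
)

# Distances as described above; component names are illustrative only.
dists <- list(
  lik_dist   = 2 * (full$loglik - loo$loglik),
  pi_dist    = loo$pi - full$pi,
  theta_dist = loo$theta - full$theta
)

# Rank samples by influence on the likelihood, most influential first.
order(dists$lik_dist, decreasing = TRUE)
```

In these made-up numbers the third sample stands out on all three measures, which is the pattern one would expect from an outlier candidate.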