AllelicMeta_est.Rd
AllelicMeta_est
calculates sequencing error rates, inbreeding
coefficients, allele frequencies, genotypes and genotype probabilities from ref_counts
and var_counts
for a specific locus. To this end it fits a (regular) binomial mixture model with no allelic bias in heterozygotes (both alleles equally likely).
This somewhat simplistic model (no overdispersion parameter, no allelic bias) makes for an inferior but very fast fit,
which is ideal to get rough estimates of the allelic population metaparameters (sequencing error rate and inbreeding coefficient) by e.g. taking their mean or median across a large amount of loci.
The per-sample allele frequencies and genotype (probabilities) are less reliable and, as such, not this function's main use.
AllelicMeta_est(
ref_counts,
var_counts,
deltaF = 10^-8,
maxIT = 100,
SE_prior = 0.002,
F_inbr_prior = NULL,
HetProb = 0.5
)
Numeric vector. Reference counts.
Numeric vector. Variant counts.
Number. Expectation-Maximisation threshold, minimal difference between two consecutive iterations (default is 1e-08).
Number. Maximum number of iterations of the Expectation-Maximisation algorithm (default is 100).
Number. Initial estimate of the sequencing error rate (default is 0.002).
Number. Initial estimate of the inbreeding coefficient used for calculating initial genotype frequencies (default is NULL, in which case the initial genotype frequences all get set to 1/3).
Number. Allelic bias in heterozygotes (expected reference over total allele count in RNAseq data; default is 0.5
A list containing the following components:
The estimated allele frequency.
The estimated sequencing error rate.
The estimated inbreeding coefficient.
The most likely genotype (rr, rv, vv) of each sample.
The genotype probabilties (p(rr), p(rv), p(vv)) for each sample.
The number of iterations
AllelicMeta_est(c(5, 8, 10, 3, 5, 6, 23), c(8, 8, 6, 4, 4, 10, 0))
#> $allelefreq
#> [1] 0.5714285
#>
#> $SE
#> [1] 0
#>
#> $F_inbr
#> [1] -0.7500002
#>
#> $genotypes
#> [1] "rv" "rv" "rv" "rv" "rv" "rv" "rr"
#>
#> $genoprobs
#> p.rr. p.rv. p.vv.
#> 1 0.0000000 1.000000e+00 0
#> 2 0.0000000 1.000000e+00 0
#> 3 0.0000000 1.000000e+00 0
#> 4 0.0000000 1.000000e+00 0
#> 5 0.0000000 1.000000e+00 0
#> 6 0.0000000 1.000000e+00 0
#> 7 0.9999993 7.152558e-07 0
#>
#> $nrep
#> [1] 3
#>
AllelicMeta_est(c(5, 0, 0, 3, 5, 1, 23), c(1, 8, 6, 2, 0, 10, 0),
SE_prior = 0.2, F_inbr_prior = 0.1)
#> $allelefreq
#> [1] 0.4931604
#>
#> $SE
#> [1] 0.03413917
#>
#> $F_inbr
#> [1] 0.678992
#>
#> $genotypes
#> [1] "rr" "vv" "vv" "rv" "rr" "vv" "rr"
#>
#> $genoprobs
#> p.rr. p.rv. p.vv.
#> 1 8.253493e-01 1.746494e-01 1.330900e-06
#> 2 2.353486e-12 1.936346e-03 9.980637e-01
#> 3 1.873889e-09 7.187552e-03 9.928124e-01
#> 4 7.935723e-02 9.177449e-01 2.897870e-03
#> 5 9.857576e-01 1.424238e-02 5.618446e-08
#> 6 8.271749e-14 7.557152e-03 9.924428e-01
#> 7 9.999999e-01 1.029936e-07 4.226079e-34
#>
#> $nrep
#> [1] 20
#>