BetaBinomGenotyping is a wrapper around EMfit_betabinom_robust that accepts multiple loci as input (as a list). Besides calling the latter to perform an EM fit and updating the input per-locus dataframes as described in the EMfit_betabinom_robust output, it returns a results dataframe with one row per locus, containing genotyping and allelic bias detection results. This function is mainly useful when following maelstRom's vignette step by step, as its input requirements depend rather strictly on having performed all previous steps in that pipeline. See the help page of EMfit_betabinom_robust for more information on these outputs.

BetaBinomGenotyping(
  DataList,
  allelefreq = 0.5,
  SE,
  inbr = 0,
  dltaco = 10^-6,
  HWE = FALSE,
  p_InitEst = FALSE,
  ThetaInits = "moment",
  ReEstThetas = "moment",
  NoSplitHom = TRUE,
  NoSplitHet = TRUE,
  ResetThetaMin = 10^-10,
  ResetThetaMax = 10^-1,
  DistRob = "Cook",
  CookMargin = 5,
  LikEmpNum = 1000,
  LikMargin = 0,
  NumHetMin = 5,
  MaxOutFrac = 0.5,
  thetaTRY = c(10^-1, 10^-3, 10^-7),
  fitH0 = TRUE
)

Arguments

DataList

List of dataframes, one per locus. Each dataframe should contain at least the columns "ref", "ref_count", "var", "var_count" and "est_SE", which will be the case when following the vignette up to the point where this function appears.
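
As a rough illustration of the expected input shape, the sketch below builds a toy DataList in base R and checks that every dataframe carries the required columns. The helper make_locus, the locus names and all values are hypothetical and only serve the example; the column types (character alleles, integer counts, numeric est_SE) are assumptions, and in practice the list is produced by the earlier vignette steps.

```r
# Hypothetical helper: one dataframe per locus, one row per sample.
# All values below are made up for illustration.
make_locus <- function(ref, var, ref_count, var_count, est_SE) {
  data.frame(
    ref = ref, var = var,
    ref_count = ref_count, var_count = var_count,
    est_SE = est_SE,
    stringsAsFactors = FALSE
  )
}

# Named list: the names are used as locus identifiers
DataList <- list(
  locus1 = make_locus("A", "G", c(12L, 0L, 25L, 9L), c(10L, 31L, 1L, 11L), 0.002),
  locus2 = make_locus("C", "T", c(40L, 18L, 0L), c(0L, 22L, 35L), 0.003)
)

required_cols <- c("ref", "ref_count", "var", "var_count", "est_SE")

# For each locus, collect the required columns that are absent
missing_cols <- lapply(DataList, function(df) setdiff(required_cols, colnames(df)))

# TRUE when no locus lacks a required column
ok <- all(lengths(missing_cols) == 0)
```

A check like this before calling BetaBinomGenotyping can make it easier to trace a failure back to a skipped pipeline step.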

allelefreq, SE, inbr, dltaco, HWE, p_InitEst, ThetaInits, ReEstThetas, NoSplitHom, NoSplitHet, ResetThetaMin, ResetThetaMax, DistRob, CookMargin, LikEmpNum, LikMargin, NumHetMin, MaxOutFrac, thetaTRY, fitH0

All remaining parameters of EMfit_betabinom_robust

Value

A list containing the following components:

DataList_out

The updated DataList, corresponding to the data_hash output of EMfit_betabinom_robust

Geno_AB_res

A dataframe containing the per-locus results of the EMfit_betabinom_robust EM fit, as well as some other metrics, namely:

  • position: The locus' name, according to names(DataList)

  • probshift: Fitted reference allele fraction in RNAseq reads, indicating allelic bias when different from 0.5

  • LRT: The likelihood ratio test statistic, testing for significant allelic bias

  • p: The likelihood ratio test p-value, testing for significant allelic bias

  • quality: Equals "!" if the sample contains no fitted heterozygotes, otherwise ""

  • allele_frequency: estimated reference allele frequency in the population

  • reference: reference allele nucleotide

  • variant: variant allele nucleotide

  • est_SE: per-locus sequencing error rate estimate, as outputted earlier by AllelicMeta_est

  • coverage: median coverage across samples

  • nr_samples: number of samples covering this locus with at least one reference- or variant-count

  • median_AB: median allelic bias, as outputted by median_AB

  • rho_rr: fitted reference homozygous fraction in the population

  • rho_rv: fitted heterozygous fraction in the population

  • rho_vv: fitted variant homozygous fraction in the population

  • theta_hom: fitted overdispersion parameter for the homozygous PMFs

  • theta_het: fitted overdispersion parameter for the heterozygous PMF

  • theta_hom_NoShift: fitted overdispersion parameter for the homozygous PMFs assuming no allelic bias

  • theta_het_NoShift: fitted overdispersion parameter for the heterozygous PMF assuming no allelic bias

  • Chi2PVAL: p-value of a chi square test assessing Hardy-Weinberg-Equilibrium on the locus given the inbreeding coefficient metaparameter; see HWE_chisquared

  • Chi2STAT: test statistic of a chi square test assessing Hardy-Weinberg-Equilibrium on the locus given the inbreeding coefficient metaparameter; see HWE_chisquared
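
The columns listed above lend themselves to simple downstream filtering. The sketch below is one possible way to do this in base R and is not part of maelstRom itself: it uses a mock Geno_AB_res with only a few of the columns and invented values, drops loci flagged with quality "!", and applies a Benjamini-Hochberg correction to the p column. The p_adj column name and the 5% threshold are arbitrary choices made for this example.

```r
# Mock results table; column names follow the list above, values are invented.
Geno_AB_res <- data.frame(
  position  = c("locus1", "locus2", "locus3", "locus4"),
  probshift = c(0.50, 0.71, 0.32, 0.55),
  p         = c(0.80, 0.001, 0.004, 0.03),
  quality   = c("", "", "!", ""),
  stringsAsFactors = FALSE
)

# Keep only loci with at least one fitted heterozygote (quality == "")
reliable <- Geno_AB_res[Geno_AB_res$quality == "", ]

# Benjamini-Hochberg correction across the retained loci
reliable$p_adj <- p.adjust(reliable$p, method = "BH")

# Loci passing the (arbitrary) 5% false discovery rate threshold
biased <- reliable[reliable$p_adj < 0.05, "position"]
```

Filtering on quality before the correction means loci without fitted heterozygotes do not count towards the multiple-testing burden; whether that ordering is appropriate depends on the analysis at hand.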