standard_alleles determines the reference and variant allele for a SNP position. Using total counts across all samples, these are simply the two most abundant nucleotides that are also listed in the ref_alleles column of data_pos. In case of ties (more than two alleles sharing the highest count, or more than one allele sharing the second-highest count), which should be very rare, the reference and/or variant allele is chosen at random among the tied candidates.
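The selection rule described above can be illustrated with a minimal base R sketch. The helper name pick_two_alleles_sketch is hypothetical and this is not the function's actual implementation; it assumes that ref_alleles holds a "/"-separated string (as in "A/T/C/G") that is identical on every row, and it breaks ties through a random secondary sort key.

```r
# Hypothetical helper for illustration only; not the code behind standard_alleles.
pick_two_alleles_sketch <- function(data_pos) {
  nucleotides <- c("A", "T", "C", "G")
  # Total count per nucleotide across all samples (rows)
  totals <- colSums(data_pos[, nucleotides, drop = FALSE])
  # Assumption: ref_alleles is a "/"-separated string, the same on every row
  allowed <- strsplit(as.character(data_pos$ref_alleles[1]), "/", fixed = TRUE)[[1]]
  totals <- totals[names(totals) %in% allowed]
  # Sort by count, highest first; the random second key decides between tied counts
  ranked <- names(totals)[order(totals, runif(length(totals)), decreasing = TRUE)]
  list(ref = ranked[1], var = ranked[2])
}

# Toy counts for two samples at one SNP position
toy <- data.frame(
  ref_alleles = "A/G",
  A = c(10, 5), T = c(0, 1), C = c(0, 0), G = c(3, 4)
)
sketch_result <- pick_two_alleles_sketch(toy)
# sketch_result$ref is "A" (total 15) and sketch_result$var is "G" (total 7)
```

If fewer than two nucleotides are listed in ref_alleles, this sketch returns NA for the missing allele; how the real function handles that case is not stated here.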

standard_alleles(data_pos)

Arguments

data_pos

Data frame. Data frame of a SNP position with columns: "chromosome", "position", "ref_alleles", "dbSNP_ref", "gene", "A", "T", "C", "G", "sample", "sample_nr". At least the allelic count columns ("A", "T", "C", "G") and the dbSNP reference alleles column ("ref_alleles") must be present. If no dbSNP reference alleles are available, "A/T/C/G" can be used as reference alleles.
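As an illustration, a minimal data_pos containing only the required columns might be built as follows. The counts are made up, and the guarded call assumes that the package providing standard_alleles has already been loaded.

```r
# Made-up counts for three samples at one SNP position (illustration only)
data_pos <- data.frame(
  ref_alleles = "A/T/C/G",  # fallback when no dbSNP reference alleles are available
  A = c(12, 0, 7),
  T = c(0, 0, 1),
  C = c(0, 0, 0),
  G = c(9, 15, 8),
  stringsAsFactors = FALSE
)

# Only call the function when it is available in the current session
if (exists("standard_alleles")) {
  data_pos_std <- standard_alleles(data_pos)
}
```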

Value

The data as a data frame with the standard alleles ("ref_alleles", "ref" and "var"), as well as their respective counts ("ref_count" and "var_count").