final_filter retains only significantly imprinted SNPs (after adjusting for multiple testing) and SNPs of interest (with suitable goodness of fit (GOF), median imprinting and degree of imprinting) over all chromosomes. Depending on the file_* arguments, results files and allelic count files are written to results_wd. When both file_all_counts and file_impr_counts are set to FALSE, this function can be used to simply filter the results_df input.

final_filter(
  data_hash,
  results_df,
  results_wd,
  gof_filt = 1.2,
  adj_p_filt = 0.05,
  med_impr_filt = 0.8,
  i_filt = 0.6,
  file_all = TRUE,
  file_impr = TRUE,
  file_all_counts = FALSE,
  file_impr_counts = TRUE
)

Arguments

data_hash

Hash. Hash keyed by SNP position, holding one data frame per SNP position.

results_df

Data frame. Results data frame with columns: "position", "gene", "LRT", "p", "estimated.i", "allele.frequency", "dbSNP", "reference", "variant", "est_SE", "coverage", "nr_samples", "GOF", "symmetry", "med_impr", "est_inbreeding", "tot_inbreeding".

results_wd

String. Directory where results files are written to.

gof_filt

Number. Minimal goodness of fit (GOF), computed per locus as mean(log(sample likelihood under imprinted model * sample coverage + 1)) across the samples of that locus (default is 1.2).
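
The GOF expression above can be illustrated with a small base R sketch. The per-sample likelihoods and coverages are made-up numbers, the variable names are hypothetical, the natural logarithm is assumed, and the comparison with `>=` is an assumption about how the cutoff is applied; this is not the package's own code.

```r
# Toy per-sample values for one locus (illustrative numbers only).
sample_lik <- c(0.9, 0.7, 0.85, 0.6)  # likelihood of each sample under the imprinted model
sample_cov <- c(30, 12, 45, 8)        # coverage (read count) of each sample

# GOF as described in the text: mean over samples of log(likelihood * coverage + 1).
# log() is the natural logarithm here, which is an assumption.
gof <- mean(log(sample_lik * sample_cov + 1))

# Compare with the cutoff that would be passed as gof_filt
# (the use of ">=" is an assumption).
gof_filt <- 1.2
passes_gof <- gof >= gof_filt
```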

adj_p_filt

Number. The FDR-adjusted significance level used as filter (default is 0.05).
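
As a rough illustration of filtering on FDR-adjusted p-values, the sketch below uses base R's `p.adjust` with the Benjamini-Hochberg method. Whether final_filter uses exactly this method and a `<=` comparison is an assumption; the p-values are made up.

```r
# Raw p-values for five hypothetical SNPs (illustrative numbers only).
p_raw <- c(0.001, 0.012, 0.04, 0.2, 0.6)

# Benjamini-Hochberg FDR adjustment (assumed method; "BH" and "fdr" are aliases in p.adjust).
p_adj <- p.adjust(p_raw, method = "BH")

# Keep SNPs whose adjusted p-value does not exceed the cutoff passed as adj_p_filt.
adj_p_filt <- 0.05
keep <- p_adj <= adj_p_filt
```

Note that the third SNP has a raw p-value below 0.05 but is dropped after adjustment, which is the point of correcting for multiple testing.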

med_impr_filt

Number. Minimal median imprinting (default is 0.8).

i_filt

Number. Minimal degree of imprinting (default is 0.6).

file_all

Logical. Whether to write a file with all SNP information (imprinted and non-imprinted SNPs). Default is TRUE.

file_impr

Logical. Whether to write a file with imprinted SNP information. Default is TRUE.

file_all_counts

Logical. Whether to write a file with all SNP counts (imprinted and non-imprinted SNPs). Default is FALSE.

file_impr_counts

Logical. Whether to write a file with imprinted SNP counts. Default is TRUE.

Value

Data frame with results filtered on adjusted p-value, GOF, median imprinting and degree of imprinting.
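
To make the four filters concrete, here is a minimal base R sketch that applies them to a toy data frame. It is not the package's implementation: the `adj_p` column name is hypothetical, the Benjamini-Hochberg adjustment and the direction of each comparison are assumptions, and only the columns needed for filtering are included.

```r
# Toy results data frame with a subset of the documented columns (made-up values).
results_df <- data.frame(
  position    = c(101, 202, 303, 404),
  p           = c(0.0005, 0.0008, 0.3, 0.001),
  estimated.i = c(0.9, 0.7, 0.95, 0.4),
  GOF         = c(1.5, 0.9, 2.0, 1.8),
  med_impr    = c(0.85, 0.9, 0.9, 0.5)
)

# FDR-adjusted p-values; "adj_p" is a hypothetical column name for this sketch.
results_df$adj_p <- p.adjust(results_df$p, method = "BH")

# Cutoffs taken from the defaults in the usage section.
gof_filt <- 1.2
adj_p_filt <- 0.05
med_impr_filt <- 0.8
i_filt <- 0.6

# Keep rows that satisfy all four criteria (comparison directions are assumptions).
keep <- with(
  results_df,
  adj_p <= adj_p_filt & GOF >= gof_filt &
    med_impr >= med_impr_filt & estimated.i >= i_filt
)
filtered <- results_df[keep, , drop = FALSE]
```

In this toy input only the SNP at position 101 passes: position 202 fails the GOF cutoff, 303 fails the adjusted p-value cutoff, and 404 fails both imprinting cutoffs.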