Detect the number of alleles/nucleotides per marker. A dataset sometimes shows 3 alleles at a SNP: is this biological or an artifact? This function helps answer that question by highlighting markers with this potential problem, so that the user can investigate the origin of the phenomenon. The function can also split a dataset into biallelic and multiallelic datasets.

detect_biallelic_problems(
  data,
  verbose = TRUE,
  parallel.core = parallel::detectCores() - 1
)

Arguments

data

A tidy data frame object in the global environment, or a file in the working directory containing a tidy data frame in wide or long format. The tidy dataset needs a column with nucleotide information, GT_VCF_NUC, which is usually generated automatically by radiator. To get a tidy data frame, see radiator's tidy_genomic_data.

verbose

(optional, logical) When verbose = TRUE, the function is chatty during execution. Default: verbose = TRUE.

parallel.core

(optional) The number of cores used for parallel execution. Default: parallel.core = parallel::detectCores() - 1.
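
The default for parallel.core is computed from parallel::detectCores(), which is documented as possibly returning NA and which returns 1 on a single-core machine (making the default 0). A minimal sketch of picking a safe value before calling the function is shown here; the variable names n_detected and n_cores are illustrative only and are not part of radiator.

```r
# Sketch: choose a core count that is always at least 1.
# n_detected and n_cores are hypothetical names used for illustration.
n_detected <- parallel::detectCores()

n_cores <- if (is.na(n_detected)) {
  1L                              # core count unknown: fall back to one core
} else {
  max(1L, n_detected - 1L)        # leave one core free, but never go below 1
}
```

The resulting n_cores could then be passed as parallel.core = n_cores.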

Value

Several pieces of information.

Author

Thierry Gosselin thierrygosselin@icloud.com
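
Examples

The check described above can be illustrated with a small base R sketch that counts the distinct nucleotides observed per marker in a GT_VCF_NUC column. This is not the function's actual implementation. The column names MARKERS and INDIVIDUALS, the "A/G" genotype encoding, the "./." missing code, and the helper count_alleles are assumptions made for the example.

```r
# Toy tidy data in long format.
# Assumptions: MARKERS/INDIVIDUALS column names, "A/G" style genotypes,
# and "./." for missing genotypes.
tidy <- data.frame(
  MARKERS     = rep(c("locus_1", "locus_2"), each = 4),
  INDIVIDUALS = rep(paste0("ind_", 1:4), times = 2),
  GT_VCF_NUC  = c("A/A", "A/G", "G/G", "./.",
                  "C/T", "C/G", "T/T", "C/C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of distinct nucleotides in a genotype vector,
# ignoring the missing allele code ".".
count_alleles <- function(gt) {
  nuc <- unlist(strsplit(gt, "/", fixed = TRUE))
  length(unique(nuc[nuc != "."]))
}

# Named integer vector: one allele count per marker.
n_alleles <- vapply(
  split(tidy$GT_VCF_NUC, tidy$MARKERS),
  count_alleles,
  integer(1)
)

# Markers with more than 2 alleles are the ones worth a closer look.
multiallelic <- names(n_alleles)[n_alleles > 2]

# Split the dataset into biallelic and multiallelic parts.
biallelic_data    <- tidy[!tidy$MARKERS %in% multiallelic, ]
multiallelic_data <- tidy[tidy$MARKERS %in% multiallelic, ]
```

In this toy dataset, locus_1 carries two nucleotides (A, G) and locus_2 carries three (C, G, T), so only locus_2 is flagged.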