Write a pcadapt file from a tidy data frame of biallelic markers. Used internally in radiator and might be of interest to users.

write_pcadapt(
  data,
  pop.select = NULL,
  filename = NULL,
  parallel.core = parallel::detectCores() - 1,
  ...
)

Arguments

data

A tidy data frame object in the global environment or a tidy data frame file, in wide or long format, in the working directory. To get a tidy data frame, see radiator's tidy_genomic_data.

pop.select

(optional, string) Populations to keep for the analysis, e.g. pop.select = c("QUE", "ONT") to keep only the samples from the QUE and ONT populations (out of 20 pops). Default: pop.select = NULL.

filename

(optional) The file name prefix for the pcadapt file written to the working directory. With the default, filename = NULL, the date and time are appended to radiator_pcadapt_.

parallel.core

(optional) The number of cores used for parallel execution during import. Default: parallel.core = parallel::detectCores() - 1.

...

(optional) To pass further arguments for fine-tuning the function.
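The defaults described for filename and parallel.core can be sketched in base R. This is only an illustration under stated assumptions, not radiator's own code: make_pcadapt_filename is a hypothetical helper, the timestamp layout is assumed, and the fallback to one core is an added safeguard that the documented default does not mention.

```r
# Hypothetical helper (not part of radiator): resolve the output file prefix.
make_pcadapt_filename <- function(filename = NULL, now = Sys.time()) {
  if (is.null(filename)) {
    # Assumed timestamp layout; radiator's actual format may differ.
    stamp <- format(now, "%Y%m%d@%H%M")
    filename <- paste0("radiator_pcadapt_", stamp)
  }
  filename
}

# Documented default for parallel.core: all detected cores minus one.
n_cores <- parallel::detectCores() - 1
# detectCores() can return NA, and a single-core machine would give 0,
# so this sketch falls back to one core (an added safeguard).
if (is.na(n_cores) || n_cores < 1) n_cores <- 1L

make_pcadapt_filename("my_scan")  # user-supplied prefix is returned unchanged
make_pcadapt_filename()           # default prefix followed by a date-time stamp
```

Passing an explicit filename keeps output names reproducible across runs, while the timestamped default helps avoid overwriting an earlier file.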

Value

A pcadapt file is written to the working directory; a genotype matrix object is also generated in the global environment.
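To give an idea of what such a genotype matrix and file can look like, here is a minimal base R sketch. It assumes the layout of the pcadapt package's "pcadapt" file type as commonly described (markers in rows, individuals in columns, genotypes coded 0/1/2 and missing values coded 9). The toy data, object names and file extension are hypothetical, and the exact output of write_pcadapt may differ.

```r
# Toy genotype matrix: 2 markers (rows) x 4 individuals (columns).
# Values are counts of one allele (0, 1 or 2); NA marks a missing genotype.
geno <- matrix(
  c(0, 1, 2, 1,
    1, NA, 0, 2),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("m1", "m2"), c("i1", "i2", "i3", "i4"))
)

# Assumption: missing genotypes are coded 9 in the pcadapt format.
geno[is.na(geno)] <- 9

# Write a space-separated file without row or column names.
out <- tempfile(fileext = ".pcadapt")  # hypothetical extension
write.table(geno, out, col.names = FALSE, row.names = FALSE,
            quote = FALSE, sep = " ")

readLines(out)
```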

Details

Integrated filters:

  1. By default, only markers found in common between populations are used (see the advanced mode section).

  2. By default, monomorphic markers are automatically removed before generating the pcadapt file.
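The logic of these two filters can be sketched on a toy data frame in base R. This is an illustration only, not radiator's implementation: the column names (MARKERS, POP_ID, INDIVIDUALS, GT), the allele-count coding and the definition of "common" used here (at least one genotyped sample in every population) are assumptions.

```r
# Toy tidy data: 3 markers x 4 individuals from 2 populations.
# GT is the count of the alternate allele (0, 1, 2); NA is missing.
tidy <- data.frame(
  MARKERS     = rep(c("m1", "m2", "m3"), each = 4),
  POP_ID      = rep(c("QUE", "QUE", "ONT", "ONT"), times = 3),
  INDIVIDUALS = rep(c("i1", "i2", "i3", "i4"), times = 3),
  GT          = c(0, 1, 2, 1,    # m1: polymorphic, genotyped in both pops
                  0, 0, 0, 0,    # m2: monomorphic
                  1, 2, NA, NA)  # m3: genotyped in QUE only
)

# Filter 1: keep markers genotyped in every population.
genotyped <- tidy[!is.na(tidy$GT), ]
pops_per_marker <- tapply(genotyped$POP_ID, genotyped$MARKERS,
                          function(p) length(unique(p)))
n_pops <- length(unique(tidy$POP_ID))
common <- names(pops_per_marker)[pops_per_marker == n_pops]

# Filter 2: among common markers, drop those where a single allele is present
# (alternate allele frequency of exactly 0 or 1).
kept <- genotyped[genotyped$MARKERS %in% common, ]
alt_freq <- tapply(kept$GT, kept$MARKERS,
                   function(g) sum(g) / (2 * length(g)))
polymorphic <- names(alt_freq)[alt_freq > 0 & alt_freq < 1]

common       # m1 and m2 pass the common-markers filter
polymorphic  # only m1 remains after removing monomorphic markers
```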

Advanced mode

The dots-dots-dots argument (...) allows passing several arguments to fine-tune the function:

  1. Filtering for linkage disequilibrium: three arguments (filter.long.ld, long.ld.missing, ld.method), described in filter_ld, are available. Reducing linkage before running a genome scan is essential. At a minimum, start by removing SNPs on the same RADseq locus (short linkage disequilibrium).

  2. Filtering markers with low Minor Allele Count, Frequency or Depth: use the 2 arguments provided by the function filter_ma (read its documentation), filter.ma and ma.stats, to evaluate the impact of MAC/MAF/MAD on genome scans. The function filter_ma is called.

  3. Turning off the filter that keeps markers in common between strata: this is not recommended, but users who want to explore the impact of such filtering, and who understand the bias it can generate, can use the argument filter.common.markers. The function filter_common_markers is called. Default: filter.common.markers = NULL.

References

Luu, K., Bazin, E., & Blum, M. G. (2017). pcadapt: an R package to perform genome scans for selection based on principal component analysis. Molecular Ecology Resources, 17(1), 67-77.

Duforet-Frebourg, N., Luu, K., Laval, G., Bazin, E., & Blum, M. G. (2015). Detecting genomic signatures of natural selection with principal component analysis: application to the 1000 Genomes data. Molecular Biology and Evolution, msv334.

Author

Thierry Gosselin thierrygosselin@icloud.com