Transform bi-allelic PLINK files in .tped or .bed formats into a tidy dataset.

Used internally in radiator and assigner; it may also be of interest to users.

tidy_plink(
  data,
  parallel.core = parallel::detectCores() - 1,
  verbose = FALSE,
  ...
)

Arguments

data

The PLINK file.

  • bi-allelic data only. For haplotypes use VCF.

  • tped file format: the corresponding tfam file must be in the same directory.

  • bed file format: this is the preferred format; the corresponding fam and bim files must be in the same directory.
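
Because the companion files are expected alongside the data file, a quick pre-flight check can catch a missing file before the import starts. The sketch below uses base R only. The helper name plink_companions_ok is hypothetical (it is not part of radiator), and it assumes the companion files share the data file's base name, which is the usual PLINK convention.

```r
# Hypothetical helper (not part of radiator): checks that the companion
# files needed for a PLINK file sit next to it, assuming they share the
# same base name (e.g. demo.bed + demo.bim + demo.fam).
plink_companions_ok <- function(data) {
  ext <- tolower(tools::file_ext(data))
  prefix <- tools::file_path_sans_ext(data)
  companions <- switch(
    ext,
    bed = paste0(prefix, c(".bim", ".fam")),
    tped = paste0(prefix, ".tfam"),
    stop("Expected a .bed or .tped file, got: ", data)
  )
  missing <- companions[!file.exists(companions)]
  if (length(missing) > 0L) {
    message("Missing companion file(s): ", paste(missing, collapse = ", "))
  }
  length(missing) == 0L
}

# Demonstration with empty placeholder files in a temporary directory
tmp <- tempfile("plink_demo_")
dir.create(tmp)
file.create(file.path(tmp, c("demo.bed", "demo.bim", "demo.fam", "solo.tped")))
bed_ok <- plink_companions_ok(file.path(tmp, "demo.bed"))   # bim and fam present
tped_ok <- plink_companions_ok(file.path(tmp, "solo.tped")) # tfam absent
```

The check only confirms that the files exist; it does not validate their content.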

parallel.core

(optional) The number of cores used for parallel execution during import. Default: parallel.core = parallel::detectCores() - 1.
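
On a single-core machine the default expression evaluates to 0, and parallel::detectCores() is documented to return NA when the core count is unknown. A minimal sketch of a defensive way to compute the value before passing it on follows; the helper name safe_cores is hypothetical and not part of radiator.

```r
# Hypothetical helper (not part of radiator): leaves one core free,
# but never returns less than 1, even when detection fails (NA).
safe_cores <- function(detected = parallel::detectCores()) {
  if (is.na(detected)) {
    return(1L)
  }
  max(1L, as.integer(detected) - 1L)
}

n_cores <- safe_cores()

# The result can then be supplied explicitly, for example:
# radiator::tidy_plink(data = "my_plink_file.bed", parallel.core = n_cores)
```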

verbose

(optional, logical) When verbose = TRUE the function is a little more chatty during execution. Default: verbose = FALSE.

...

(optional) Advanced mode that allows passing further arguments to fine-tune the function. Also used for legacy arguments (see details or special section).

Value

A tidy tibble of the PLINK file.

Advanced mode

The dots-dots-dots (...) argument allows passing several arguments to fine-tune the function:

  1. calibrate.alleles: logical. For tped files, if calibrate.alleles = FALSE the function runs faster, but REF/ALT alleles may not be calibrated. The default assumes that the user or the software that produced the PLINK file already calibrated the alleles. Default: calibrate.alleles = FALSE.
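
As an illustration of the mechanism, an advanced argument is supplied by name in the call and is picked up through the dots. The sketch below assumes that calibrate.alleles = TRUE requests allele calibration (the counterpart of the documented FALSE default); the file name is a placeholder, and the call only runs when radiator is installed and the file exists.

```r
# Placeholder path; replace with a real tped file (its tfam file must be
# in the same directory).
plink_file <- "my_plink_file.tped"
imported <- FALSE

if (requireNamespace("radiator", quietly = TRUE) && file.exists(plink_file)) {
  tidy_data <- radiator::tidy_plink(
    data = plink_file,
    calibrate.alleles = TRUE, # advanced argument, passed through ...
    verbose = TRUE
  )
  imported <- TRUE
}
```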

References

Zheng X, Gogarten S, Lawrence M, Stilp A, Conomos M, Weir BS, Laurie C, Levine D (2017). SeqArray -- A storage-efficient high-performance data format for WGS variant calls. Bioinformatics.

PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics. 2007: 81: 559–575. doi:10.1086/519795

Author

Thierry Gosselin thierrygosselin@icloud.com

Examples

if (FALSE) {
# import a PLINK bed file (the fam and bim files must be in the same directory)
data <- radiator::tidy_plink(data = "my_plink_file.bed", verbose = TRUE)

# when conversion is required from TPED to BED, in Terminal:
# plink --tfile my_plink_file --make-bed --allow-no-sex --allow-extra-chr --chr-set 95
}