Used internally in radiator and might be of interest for users. The function generates a tidy dataset of DArT markers and associated metadata. Useful for filtering markers before importing the actual dataset.

tidy_dart_metadata(
  data,
  filename = NULL,
  verbose = FALSE,
  parallel.core = parallel::detectCores() - 1
)

Arguments

data

DArT output file. Most popular formats used by DArT are recognised (1- and 2-row formats, also called binary, and count data). If you encounter a problem, send me your data so that I can update the function. The function can import .csv or .tsv files.

filename

(optional) The function uses write.fst to write the tidy data frame to the working directory. The .rad file extension is appended to the filename provided. With the default, filename = NULL, the tidy data frame is kept in the global environment only (i.e. not written to the working directory).

verbose

(optional, logical) When verbose = TRUE, the function is a little more chatty during execution. Default: verbose = FALSE.

parallel.core

(optional) The number of cores used for parallel execution during import. Default: parallel.core = parallel::detectCores() - 1.
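
The default above subtracts one from the detected core count. As a minimal sketch (not radiator's internal code), the value passed to parallel.core could be computed defensively, since parallel::detectCores() may return NA on some platforms and returns 1 on a single-core machine. The variable names n.detected and n.cores are hypothetical.

```r
# Sketch only: a defensive way to choose a core count, not radiator's own logic.
n.detected <- parallel::detectCores()  # may be NA on some platforms

# Fall back to 1 core when detection fails or only one core is available.
n.cores <- if (is.na(n.detected)) 1L else max(1L, n.detected - 1L)

# The result could then be supplied explicitly, e.g.:
# radiator::tidy_dart_metadata(data = "clownfish.dart.tsv", parallel.core = n.cores)
```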

Value

A tidy data frame with these columns:

  1. MARKERS: generated by radiator; corresponds to CHROM, LOCUS and POS joined with 2 underscores as separator.

  2. CHROM: the chromosome, for de novo: CHROM_1.

  3. LOCUS: the locus.

  4. POS: the SNP id on the LOCUS.

  5. REF: the reference allele.

  6. ALT: the alternate allele.

  7. CALL_RATE: the call rate, an output specific to DArT.

  8. AVG_COUNT_REF: the coverage for the reference allele, an output specific to DArT.

  9. AVG_COUNT_SNP: the coverage for the alternate allele, an output specific to DArT.

  10. REP_AVG: the reproducibility average, an output specific to DArT.
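
To illustrate how the MARKERS column relates to CHROM, LOCUS and POS, here is a small sketch in base R. The data frame meta and its values are invented for illustration, and the code is an assumption about the naming scheme described above, not radiator's implementation.

```r
# Toy values, invented for illustration only.
meta <- data.frame(
  CHROM = c("CHROM_1", "CHROM_1"),
  LOCUS = c("100", "101"),
  POS = c("12", "45"),
  stringsAsFactors = FALSE
)

# Join the three fields with two underscores, as described for MARKERS.
meta$MARKERS <- paste(meta$CHROM, meta$LOCUS, meta$POS, sep = "__")
meta$MARKERS
# "CHROM_1__100__12" "CHROM_1__101__45"

# Splitting on the double underscore recovers the three parts; the single
# underscore inside "CHROM_1" is left untouched.
parts <- strsplit(meta$MARKERS, "__", fixed = TRUE)
```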

Author

Thierry Gosselin thierrygosselin@icloud.com

Examples

if (FALSE) {
  clownfish.dart.tidy <- radiator::tidy_dart_metadata(
    data = "clownfish.dart.tsv",
    verbose = TRUE
  )
}
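
Since the metadata is meant to help filter markers before importing the full dataset, the following sketch shows one way that filtering might look in base R. It uses a toy data frame in place of the real output. The values, the thresholds (min.call.rate, min.rep.avg) and the assumption that CALL_RATE and REP_AVG are proportions between 0 and 1 are all hypothetical.

```r
# Toy stand-in for the data frame returned by tidy_dart_metadata();
# all values are invented for illustration.
dart.meta <- data.frame(
  MARKERS = c("CHROM_1__100__12", "CHROM_1__101__45", "CHROM_1__102__7"),
  CALL_RATE = c(0.98, 0.62, 0.91),
  REP_AVG = c(1.00, 0.99, 0.90),
  stringsAsFactors = FALSE
)

# Hypothetical thresholds; choose values suited to your own data.
min.call.rate <- 0.80
min.rep.avg <- 0.95

# Keep markers that pass both thresholds.
keep <- dart.meta$CALL_RATE >= min.call.rate & dart.meta$REP_AVG >= min.rep.avg
whitelist <- dart.meta$MARKERS[keep]
whitelist
# "CHROM_1__100__12"
```

The resulting vector of marker names can serve as a whitelist when the actual dataset is imported later.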