Write a dadi SNP input file from a tidy data frame.

This function will generate a dadi SNP input file using a radiator tidy tibble.

Note that missing data can bias demographic inference. You may therefore want to check its impact by running several datasets in dadi: one with the missing data (or several with varying thresholds of missingness) and one with imputed genotypes (check my other package grur for this).

write_dadi(
  data,
  fasta.ingroup = NULL,
  fasta.outgroup = NULL,
  sumstats.ingroup = NULL,
  sumstats.outgroup = NULL,
  dadi.input.filename = NULL,
  calibrate.alleles = FALSE
)

Arguments

data: A tidy data frame object in the global environment, or a tidy data frame file in wide or long format in the working directory. To get a tidy data frame, see radiator tidy_genomic_data.
fasta.ingroup: (optional) The fasta file, sequences for the ingroup. Leave empty if no outgroup. Default: fasta.ingroup = NULL.
fasta.outgroup: (optional) The fasta file, sequences for the outgroup. Default: fasta.outgroup = NULL.
sumstats.ingroup: (optional) The sumstats output file from STACKS when running STACKS for the ingroup fasta file. This file is required to use an outgroup. Leave empty if no outgroup. Default: sumstats.ingroup = NULL.
sumstats.outgroup: (optional) The sumstats output file from STACKS when running STACKS for the outgroup fasta file. This file is required to use an outgroup. Default: sumstats.outgroup = NULL.
dadi.input.filename: (optional) Name of the dadi SNP input file written to the working directory, e.g. dadi.file.tsv. By default, the date and time are used to make the file name. If supplied, the file name must end with .tsv or .txt. Default: dadi.input.filename = NULL.
calibrate.alleles: (optional, logical) To re-calibrate REF and ALT alleles. This is done automatically if the required genomic format is not found in the dataset. Please use it if you have removed individuals. Default: calibrate.alleles = FALSE.

Value

The function returns a tibble and writes the dadi input file to the working directory.

References

Gutenkunst RN, Hernandez RD, Williamson SH, Bustamante CD (2009) Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data (G McVean, Ed.). PLoS Genetics, 5, e1000695.

Author

Thierry Gosselin thierrygosselin@icloud.com

Examples

if (FALSE) { # \dontrun{
# without outgroup:
dadi.data <- radiator::write_dadi(data = "my_tidy_dataset.rad")

# with outgroup and fasta generated by stacks:
dadi.data <- radiator::write_dadi(
   data = "my_tidy_dataset.rad",
   fasta.ingroup = "batch_1.ingroup.fa",
   fasta.outgroup = "batch_1.outgroup.fa",
   sumstats.ingroup = "batch_1.sumstats.ingroup.tsv",
   sumstats.outgroup = "batch_1.sumstats.outgroup.tsv"
)
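
# with a custom output file name and allele re-calibration
# (a sketch using only the arguments documented above; the file name
# "dadi.file.tsv" is a placeholder and must end with .tsv or .txt):
dadi.data <- radiator::write_dadi(
   data = "my_tidy_dataset.rad",
   dadi.input.filename = "dadi.file.tsv",
   calibrate.alleles = TRUE
)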
} # }