Nucleotide diversity — pi • radiator

Calculates the nucleotide diversity (Nei & Li, 1979).

To get an estimate with the consensus reads, use the function summary_haplotypes from the package [stackr](https://github.com/thierrygosselin/stackr). The estimate in summary_haplotypes integrates the consensus markers found in the [STACKS](http://catchenlab.life.illinois.edu/stacks/) populations.haplotypes.tsv file. Both the radiator and stackr functions require the stringdist package.

The read.length argument below is used directly in the calculations. For the estimate to be correct, all reads need to be of identical length.
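To make the role of read.length concrete, here is a minimal base R sketch of the Nei & Li (1979) estimator for a single locus: the frequency-weighted number of mismatches between pairs of haplotypes, divided by the read length. This is not radiator's internal code. The helper names hamming and pi_locus, the toy haplotypes, and the choice to omit any sample-size correction are assumptions made for illustration only.

```r
# Illustrative sketch only, not radiator's implementation.
# Count mismatching positions between two sequences of equal length.
hamming <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))  # reads must have identical length
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Nucleotide diversity at one locus (hypothetical helper):
# sum over all pairs of distinct haplotypes i, j of
# x_i * x_j * (mismatches between i and j) / read.length,
# where x_i is the frequency of haplotype i. No sample-size correction.
pi_locus <- function(haplotypes, read.length = nchar(haplotypes[1])) {
  freq <- table(haplotypes) / length(haplotypes)
  alleles <- names(freq)
  total <- 0
  for (i in seq_along(alleles)) {
    for (j in seq_along(alleles)) {
      if (i != j) {
        d <- hamming(alleles[i], alleles[j]) / read.length
        total <- total + freq[[i]] * freq[[j]] * d
      }
    }
  }
  total
}

# Toy data: four 10-nucleotide haplotypes, three of them distinct.
haps <- c("ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAC")
pi_toy <- pi_locus(haps)
```

Because the mismatch count is divided by read.length, passing a wrong read length rescales the estimate, which is why reads of unequal length cannot be compared directly.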

pi(
  data,
  read.length = NULL,
  parallel.core = parallel::detectCores() - 1,
  path.folder = NULL,
  verbose = TRUE
)

Arguments

data

(4 options) A file or an object generated by radiator, in either of these formats:

tidy data
Genomic Data Structure (GDS)

How to get GDS and tidy data? See tidy_genomic_data, read_vcf or tidy_vcf.

read.length

(integer, optional) The length of your reads, in nucleotides. By default, it is estimated from the data using the column COL. Default: read.length = NULL.

parallel.core

(optional) The number of cores used for parallel execution during import. Default: parallel.core = parallel::detectCores() - 1.
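The default expression can evaluate to 0 on a single-core machine, and parallel::detectCores() may return NA on some platforms. Whether radiator guards against these cases internally is not stated here, so the following is only a defensive sketch; the variable name safe.cores is hypothetical.

```r
# Defensive sketch: always end up with at least one core.
n.cores <- parallel::detectCores()

# detectCores() can return NA; n.cores - 1 is 0 on a single-core machine.
safe.cores <- if (is.na(n.cores)) 1L else max(1L, n.cores - 1L)

# The value can then be passed explicitly, for example:
# pi.sum <- radiator::pi(data = "brook.charr.gds", parallel.core = safe.cores)
```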

path.folder

(path, optional) The folder where results are written. By default, results are written to the working directory. Default: path.folder = NULL.

verbose

(optional, logical) When verbose = TRUE the function is a little more chatty during execution. Default: verbose = TRUE.

Value

The function returns a list with the function call and:

$pi.individuals: the pi estimated for each individual.
$pi.populations: the pi statistics estimated per population and overall.
$boxplot.pi: a boxplot of pi for each population and overall.

Use $ to access each object in the list.
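As an illustration of $ access, the sketch below builds a mock list that mimics the element names documented above. The column names (INDIVIDUALS, POP_ID, PI) and all values are made up for the example and are not radiator's actual output format; $boxplot.pi is left out because it is a plot object.

```r
# Mock result with hypothetical columns and values, for illustration only.
pi.sum <- list(
  pi.individuals = data.frame(
    INDIVIDUALS = c("ind1", "ind2"),
    PI = c(0.002, 0.004)
  ),
  pi.populations = data.frame(
    POP_ID = c("pop1", "OVERALL"),
    PI = c(0.003, 0.003)
  )
)

# List the available elements, then pull one out with $.
names(pi.sum)
pi.ind <- pi.sum$pi.individuals

# Chain $ to reach a column inside an element.
mean.pi <- mean(pi.sum$pi.individuals$PI)
```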

Note

Thanks to Anne-Laure Ferchaud for very useful comments on a previous version of this function.

References

Nei M, Li WH (1979) Mathematical model for studying genetic variation in terms of restriction endonucleases. Proceedings of the National Academy of Sciences of the United States of America, 76, 5269–5273.

Author

Thierry Gosselin thierrygosselin@icloud.com

Examples

if (FALSE) { # \dontrun{
require(stringdist)
# The simplest way to run the function:
pi.sum <- radiator::pi(data = "brook.charr.gds")
} # }