Calculates the nucleotide diversity (Nei & Li, 1979).
To get an estimate that includes the consensus reads, use the
function summary_haplotypes found in the package
[stackr](https://github.com/thierrygosselin/stackr). The estimate in
summary_haplotypes integrates the consensus markers found in the
[STACKS](http://catchenlab.life.illinois.edu/stacks/)
populations.haplotypes.tsv file.
Both the radiator and stackr functions require the stringdist
package.
The read.length argument below is used directly in the calculations.
For pi to be estimated correctly, all reads must be of identical length.
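To make the role of read.length concrete, here is a minimal base-R sketch of nucleotide diversity for a single locus: the mean number of pairwise differences between haplotypes, divided by the read length. It illustrates the Nei & Li idea under simplifying assumptions (equal-length haplotypes, every pair weighted equally) and is not radiator's actual implementation. The helpers `hamming_distance` and `pi_locus` and the toy haplotypes are hypothetical.

```r
# Hypothetical helper: number of mismatching positions between two
# equal-length sequences (Hamming distance).
hamming_distance <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Hypothetical helper: mean pairwise differences per site for one locus.
pi_locus <- function(haplotypes, read.length = NULL) {
  stopifnot(length(haplotypes) >= 2)
  # Assumption: when read.length is NULL, take it from the sequences themselves.
  if (is.null(read.length)) read.length <- nchar(haplotypes[1])
  # All unordered pairs of haplotypes, one pair per column.
  pairs <- utils::combn(length(haplotypes), 2)
  diffs <- apply(pairs, 2, function(idx) {
    hamming_distance(haplotypes[idx[1]], haplotypes[idx[2]])
  })
  mean(diffs) / read.length
}

# Toy data: three haplotypes of 8 nucleotides each.
haps <- c("ACGTACGT", "ACGTACGA", "ACGAACGA")
pi_toy <- pi_locus(haps)
pi_toy_long <- pi_locus(haps, read.length = 16)
```

With the same pairwise differences, doubling read.length halves the estimate, which is why reads of unequal length would bias the result.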
pi(
data,
read.length = NULL,
parallel.core = parallel::detectCores() - 1,
path.folder = NULL,
verbose = TRUE
)
(4 options) A file or object generated by radiator:
tidy data
Genomic Data Structure (GDS)
How to get GDS and tidy data? Look into tidy_genomic_data, read_vcf or tidy_vcf.
(integer, optional) The length, in nucleotides, of your reads.
By default it is estimated from the data using the column COL.
Default: read.length = NULL.
(optional) The number of cores used for parallel
execution during import.
Default: parallel.core = parallel::detectCores() - 1.
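As a hedged illustration of choosing a value for this argument yourself: parallel::detectCores() can return NA on some platforms, and on a single-core machine the default expression evaluates to 0. The sketch below computes a guarded value; the variable name `n.cores` is only illustrative, and whether radiator applies a similar guard internally is not stated here.

```r
# parallel::detectCores() may return NA when the count cannot be determined.
n.cores <- parallel::detectCores()
if (is.na(n.cores)) n.cores <- 1L

# Leave one core free for the system, but never go below 1.
parallel.core <- max(1L, n.cores - 1L)
```

The resulting value can then be supplied explicitly through the parallel.core argument.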
(path, optional) By default, the results are written in the working directory.
Default: path.folder = NULL.
(optional, logical) When verbose = TRUE, the function is a little more chatty during execution.
Default: verbose = TRUE.
The function returns a list with the function call and:
$pi.individuals: the pi estimated for each individual
$pi.populations: the pi statistics estimated per population and overall.
$boxplot.pi: a boxplot of pi for each population and overall.
Use $ to access each object in the list.
Thanks to Anne-Laure Ferchaud for very useful comments on a previous version of this function.
Nei M, Li WH (1979) Mathematical model for studying genetic variation in terms of restriction endonucleases. Proceedings of the National Academy of Sciences of the United States of America, 76, 5269–5273.
if (FALSE) { # \dontrun{
require(stringdist)
# The simplest way to run the function:
pi.sum <- radiator::pi(data = "brook.charr.gds")
} # }