Generate a figure with the read depth groups — read_depth

This function reads the fastq file of an individual and generates a figure of the read coverage groups.

read_depth_plot(
  fq.file,
  min.coverage.fig = 7L,
  output = "08_stacks_results/02_read_depth_plot",
  parallel.core = parallel::detectCores() - 1
)

Arguments

fq.file: (character, path) Path to the individual fastq file to check. For example: fq.file = "my-sample.fq.gz".
min.coverage.fig: (integer) Minimum coverage used to color the coverage groups on the figure (the lower threshold of the target group, see Details). Default: min.coverage.fig = 7L.
output: (character, path) Where the figure will be saved. Default: output = "08_stacks_results/02_read_depth_plot".
parallel.core: (integer) Number of threads to use for parallel execution. Default: parallel.core = parallel::detectCores() - 1.
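
The default for parallel.core leaves one core free. As a minimal sketch (the helper name safe_cores is hypothetical and not part of the package), the value can be guarded before it is passed in, since parallel::detectCores() may return NA on some platforms and the subtraction gives 0 on a single-core machine:

```r
# Hypothetical helper, not part of the package: choose a safe thread count.
safe_cores <- function(detected = parallel::detectCores()) {
  # detectCores() can return NA when the core count is unknown
  if (is.na(detected)) {
    return(1L)
  }
  # Keep one core free, but never go below one thread
  max(1L, as.integer(detected) - 1L)
}

n.cores <- safe_cores()
```

The result could then be supplied as parallel.core = n.cores.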

Value

The function returns the read depth groups plot.

Details

Four read coverage groups are shown:

distinct reads with low coverage (in red): these reads are likely sequencing errors or uninformative polymorphisms (shared by only a few samples).
distinct reads within the target coverage (in green):
- Usually represent around 80
- It’s a safe coverage range to start exploring your data (open for discussion).
- Lower threshold (default = 7): you can’t escape it, it’s your tolerance for calling a heterozygote a true heterozygote. You want a minimum coverage for both the reference and the alternative allele. Yes, you can use population information to lower this threshold, or use a Bayesian algorithm.
- Higher threshold: this one is a lot more open for discussion. Here it’s the lower limit of another group (the orange one, described below) minus 1 bp.
distinct reads with high coverage > 1 read depth (in yellow): these are legitimate alleles with high coverage.
distinct and unique reads with high coverage (in orange): when assembled into loci, these repetitive elements are usually paralogs, retrotransposons, transposable elements, etc.
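
The grouping described above can be sketched in base R. This is only an illustration on toy data, not the function's implementation: the reads vector, the helper name classify_depth and the upper bound max.target are assumptions, and only the lower threshold of 7 comes from this documentation. Splitting the high coverage reads into the yellow and orange groups is not attempted here.

```r
# Toy data: each element stands for one read sequence from a fastq file
reads <- c(rep("ACGT", 12), rep("ACGA", 8), rep("TTGC", 2), "GGCA", rep("CCCC", 40))

# Depth of each distinct read = number of identical copies
depth <- table(reads)

# Hypothetical helper: assign a coverage group to each distinct read.
# max.target is an assumed upper bound (inclusive) of the target group.
classify_depth <- function(depth, min.coverage = 7L, max.target = 30L) {
  cut(
    as.integer(depth),
    breaks = c(0, min.coverage - 1, max.target, Inf),
    labels = c("low", "target", "high")
  )
}

groups <- data.frame(
  read = names(depth),
  depth = as.integer(depth),
  group = classify_depth(depth)
)
```

With these assumed thresholds, distinct reads seen fewer than 7 times fall in the low group, those seen 7 to 30 times in the target group, and the rest in the high group.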

References

Ilut, D., Nydam, M., Hare, M. (2014). Defining Loci in Restriction-Based Reduced Representation Genomic Data from Nonmodel Species: Sources of Bias and Diagnostics for Optimal Clustering. BioMed Research International, 2014. https://dx.doi.org/10.1155/2014/675158

Examples

if (FALSE) { # \dontrun{
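library(vroom) # attach vroom first; library() errors if it is not installed
# Plot the read depth groups for one individual's fastq file
check.reads.depth.groups <- read_depth_plot(fq.file = "my-sample.fq.gz")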
} # }