This function reads the FASTQ file of an individual and cleans it by removing:
unique reads with high coverage (likely paralogs or transposable elements)
distinct reads with low coverage
clean_fq(
fq.files,
paired.end = FALSE,
min.coverage.threshold = 2L,
max.coverage.threshold = "high.coverage.unique.reads",
remove.unique.reads = TRUE,
write.blacklist = TRUE,
write.blacklist.fasta = TRUE,
compress = FALSE,
output = "08_stacks_results/03_cleaned_fq",
parallel.core = parallel::detectCores() - 1
)
fq.files: (character, path) The path to the individual fastq file to check.
Default: fq.files = "my-sample.fq.gz"
paired.end: (logical) Are the files paired-end?
Default: paired.end = FALSE
min.coverage.threshold: (integer) Minimum coverage threshold.
The function removes distinct reads with coverage <= this threshold.
To turn off, use min.coverage.threshold = NULL or 0L.
Default: min.coverage.threshold = 2L
max.coverage.threshold: (integer, character) Maximum coverage threshold.
The function removes distinct reads with coverage >= this threshold.
To turn off, use max.coverage.threshold = NULL.
With the default, the function uses the starting depth at which high-coverage unique reads are observed.
Default: max.coverage.threshold = "high.coverage.unique.reads"
remove.unique.reads: (logical) Remove distinct unique reads with high coverage, which are likely paralogs or transposable elements.
Default: remove.unique.reads = TRUE
write.blacklist: (logical) Write the blacklisted reads to a file.
Default: write.blacklist = TRUE
write.blacklist.fasta: (logical) Write the blacklisted reads to a FASTA file.
Default: write.blacklist.fasta = TRUE
compress: (logical) Compress the output files. If you have the disk space, don't compress: writing the files is much faster without compression.
Default: compress = FALSE
output: (character, path) Directory where the cleaned fq files are written.
Default: output = "08_stacks_results/03_cleaned_fq"
parallel.core: (integer) Enable parallel execution with this number of threads.
Default: parallel.core = parallel::detectCores() - 1
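For illustration, the two coverage filters can be disabled independently of one another. A hypothetical call that keeps low-coverage distinct reads and removes the upper coverage cutoff, using only arguments documented above:

```r
# Hypothetical example: disable both coverage filters.
clean_fq(
  fq.files = "my-sample.fq.gz",
  min.coverage.threshold = 0L,    # keep distinct reads with low coverage
  max.coverage.threshold = NULL   # no upper coverage cutoff
)
```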
The function returns a cleaned fq file, with -C appended to the sample name in the filename.
coming soon, just try it in the meantime...
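In the meantime, a minimal sketch of a call with the default settings, assuming a single-end gzipped FASTQ file (all argument values are taken from the defaults shown above):

```r
# Hypothetical example: clean one single-end sample with default thresholds.
clean_fq(
  fq.files = "my-sample.fq.gz",                           # sample to clean
  paired.end = FALSE,
  min.coverage.threshold = 2L,                            # drop distinct reads with coverage <= 2
  max.coverage.threshold = "high.coverage.unique.reads",  # auto-detect upper cutoff
  remove.unique.reads = TRUE,                             # drop likely paralogs/TEs
  write.blacklist = TRUE,
  write.blacklist.fasta = TRUE,
  compress = FALSE,                                       # faster writes if disk space allows
  output = "08_stacks_results/03_cleaned_fq",
  parallel.core = parallel::detectCores() - 1
)
# The cleaned file, with -C appended to the sample name,
# is written to the output directory.
```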