This function reads the fastq file of an individual and cleans it by removing:

  • unique reads with high coverage (likely paralogs or transposable elements)

  • distinct reads with low coverage

clean_fq(
  fq.files,
  min.coverage.threshold = 2L,
  max.coverage.threshold = "high.coverage.unique.reads",
  remove.unique.reads = TRUE,
  write.blacklist = TRUE,
  write.blacklist.fasta = TRUE,
  compress = FALSE,
  output.dir = NULL,
  parallel.core = parallel::detectCores() - 1
)

Arguments

fq.files

(character, path). The path to the individual fastq file to check. This argument has no default. Example: fq.files = "my-sample.fq.gz".

min.coverage.threshold

(integer). Minimum coverage threshold. The function removes distinct reads with coverage <= the threshold. To turn off, use min.coverage.threshold = NULL or 0L. Default: min.coverage.threshold = 2L.

max.coverage.threshold

(integer, character). Maximum coverage threshold. The function removes distinct reads with coverage >= this threshold. To turn off, use max.coverage.threshold = NULL. The default uses the starting depth at which high-coverage unique reads are observed. Default: max.coverage.threshold = "high.coverage.unique.reads".

remove.unique.reads

(logical). Remove distinct unique reads with high coverage; these are likely paralogs or transposable elements. Default: remove.unique.reads = TRUE.

write.blacklist

(logical). Write the blacklisted reads to a file. Default: write.blacklist = TRUE.

write.blacklist.fasta

(logical). Write the blacklisted reads to a fasta file. Default: write.blacklist.fasta = TRUE.

compress

(logical). Compress the output files. If you have the disk space, leave compression off: writing uncompressed files is much faster. Default: compress = FALSE.

output.dir

(path). Write the cleaned fq files to a specific directory. Default: output.dir = NULL, which uses the working directory.

parallel.core

(integer). Number of threads to use for parallel execution. Default: parallel.core = parallel::detectCores() - 1.

Value

The function returns a cleaned fq file with the name of the sample and -cleaned appended to the filename.

Details

coming soon, just try it in the meantime...
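
Until the details are written, the two coverage filters described under Arguments can be illustrated with a minimal base R sketch. This is not the function's actual implementation: the toy reads, the fixed integer used for the maximum threshold, and the object names (reads, coverage, blacklist, kept) are assumptions made for illustration only.

```r
# Toy reads standing in for the sequences of one fastq file (hypothetical data).
reads <- c(
  rep("ACGTACGT", 12),  # very high coverage
  rep("TTGACCAA", 5),
  rep("GGCATTCA", 4),
  rep("CCTAGGAT", 2),   # low coverage
  "AAGTCCGT"            # singleton
)

# Coverage: number of times each distinct read is observed.
tab <- table(reads)
coverage <- setNames(as.integer(tab), names(tab))

# Thresholds as documented: remove coverage <= min and coverage >= max.
min.coverage.threshold <- 2L
max.coverage.threshold <- 12L  # fixed integer chosen for this illustration

# Distinct reads failing either threshold form the blacklist.
blacklist <- names(coverage)[
  coverage <= min.coverage.threshold | coverage >= max.coverage.threshold
]

# Reads that survive both filters.
kept <- reads[!reads %in% blacklist]
```

With these toy values, the singleton, the read seen twice, and the read seen 12 times are blacklisted, and the 9 reads belonging to the two remaining distinct sequences are kept.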

Examples
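
A hedged sketch of a typical call, using only the arguments listed in the usage above. It assumes the package that provides clean_fq() is already attached and that a file named "my-sample.fq.gz" (a placeholder path) exists in the working directory; the guard simply skips the call otherwise. The value parallel.core = 2L is an arbitrary choice for illustration.

```r
# Placeholder input; replace with the path to your own fastq file.
fq <- "my-sample.fq.gz"

# Run only when clean_fq() is available and the input file exists.
if (exists("clean_fq") && file.exists(fq)) {
  clean_fq(
    fq.files = fq,
    min.coverage.threshold = 2L,
    max.coverage.threshold = "high.coverage.unique.reads",
    remove.unique.reads = TRUE,
    write.blacklist = TRUE,
    write.blacklist.fasta = TRUE,
    compress = FALSE,    # uncompressed output is faster to write
    output.dir = NULL,   # NULL writes to the working directory
    parallel.core = 2L
  )
}
```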