This function reads the fastq file of an individual and cleans it by removing:
unique reads with high coverage (likely paralogs or transposable elements)
distinct reads with low coverage
clean_fq(
  fq.files,
  min.coverage.threshold = 2L,
  max.coverage.threshold = "high.coverage.unique.reads",
  remove.unique.reads = TRUE,
  write.blacklist = TRUE,
  write.blacklist.fasta = TRUE,
  compress = FALSE,
  output.dir = NULL,
  parallel.core = parallel::detectCores() - 1
)
fq.files | (character, path). The path to the individual fastq file to check.
Default: |
---|---|
min.coverage.threshold | (integer). Minimum coverage threshold.
The function removes distinct reads with coverage <= this threshold.
To turn off, |
max.coverage.threshold | (integer, character). Maximum coverage threshold.
The function removes distinct reads with coverage >= this threshold.
To turn off, |
remove.unique.reads | (logical). Remove distinct unique reads with high
coverage, which are likely paralogs or transposable elements.
Default: `TRUE` |
write.blacklist | (logical). Write the blacklisted reads to a file.
Default: `TRUE` |
write.blacklist.fasta | (logical). Write the blacklisted reads to a
fasta file.
Default: `TRUE` |
compress | (logical). Compress the output files. If you have the disk
space, leave compression off: writing uncompressed files is much faster.
Default: `FALSE` |
output.dir | (path). Directory in which to write the cleaned fq files.
Default: `NULL` |
parallel.core | (integer). Number of threads to use for parallel execution.
Default: `parallel::detectCores() - 1` |
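
To make the two coverage thresholds concrete, here is a minimal base R sketch of the filtering idea. It is not the function's actual implementation: the helper `filter_by_coverage()`, the toy reads, and the integer maximum of 30 are all assumptions made for illustration. In particular, the behaviour of the default `"high.coverage.unique.reads"` option is not described here, so the sketch uses a plain integer instead.

```r
# Toy reads standing in for the sequence lines of one individual's fastq file.
reads <- c(
  rep("ACGTACGT", 5),   # coverage 5: kept
  rep("TTTTGGGG", 1),   # coverage 1: removed (<= minimum threshold)
  rep("CCCCAAAA", 2),   # coverage 2: removed (<= minimum threshold)
  rep("GGGGCCCC", 40)   # coverage 40: removed (>= maximum threshold)
)

# Hypothetical helper, not part of the package: blacklist distinct reads
# whose coverage is <= min.coverage or >= max.coverage.
filter_by_coverage <- function(reads, min.coverage = 2L, max.coverage = 30L) {
  coverage <- table(reads)              # depth of each distinct read
  depth <- as.integer(coverage)
  blacklisted <- depth <= min.coverage | depth >= max.coverage
  blacklist <- names(coverage)[blacklisted]
  list(
    kept = reads[!reads %in% blacklist],
    blacklist = blacklist
  )
}

res <- filter_by_coverage(reads)

# Write the blacklisted reads to a FASTA file: one header line, then the
# sequence, for each distinct read (the header names are made up).
fasta.file <- tempfile(fileext = ".fasta")
headers <- paste0(">blacklist_", seq_along(res$blacklist))
writeLines(as.vector(rbind(headers, res$blacklist)), fasta.file)
```

Only the read with a coverage of 5 survives in `res$kept`; the three other distinct reads end up in `res$blacklist` and in the FASTA file.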
The function returns a cleaned fq file named after the sample, with `-cleaned` appended to the filename.
coming soon, just try it in the meantime...
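
Until official examples are available, the sketch below shows one way to try the function on a throwaway file. The toy fastq content and the file name `toy-sample.fq` are invented for illustration, and the call is guarded so it only runs when `clean_fq` is already loaded in the session. Whether such a small file is a meaningful input for the function is an assumption, not something this page confirms.

```r
# Build a tiny fastq file: 4 lines per record (header, sequence, "+", qualities).
toy.reads <- c(rep("ACGTACGTAC", 5), "TTTTGGGGCC")
records <- unlist(lapply(seq_along(toy.reads), function(i) {
  c(
    paste0("@read_", i),                  # made-up read identifier
    toy.reads[i],                         # sequence
    "+",                                  # separator line
    strrep("I", nchar(toy.reads[i]))      # dummy quality string, same length
  )
}))
toy.fq <- file.path(tempdir(), "toy-sample.fq")
writeLines(records, toy.fq)

# Run the cleaning step only if the function is available in this session.
# All argument names come from the usage shown above.
if (exists("clean_fq", mode = "function")) {
  clean_fq(
    fq.files = toy.fq,
    min.coverage.threshold = 2L,
    compress = FALSE,
    output.dir = tempdir(),
    parallel.core = 1L
  )
}
```

Writing to `tempdir()` keeps the experiment away from real data, and `parallel.core = 1L` avoids starting extra workers for a six-read file.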