Supporting data for "Atria: An Ultra-fast and Accurate Trimmer for Adapter and Quality Trimming"

Dataset type: Genomic, Software, Transcriptomic, Bioinformatics
Data released on September 29, 2021

Chuan J; Zhou A; Hale LR; He M; Li X (2021): Supporting data for "Atria: An Ultra-fast and Accurate Trimmer for Adapter and Quality Trimming". GigaScience Database. http://dx.doi.org/10.5524/100935

DOI: 10.5524/100935

As next-generation sequencing comes to dominate in output capacity and sequence length, adapters attached to the reads and low-quality bases degrade downstream analysis both directly and indirectly, for example by producing false-positive single nucleotide polymorphisms (SNPs) and fragmented assemblies. A fast trimming algorithm is needed to remove adapters precisely, especially in read tails with relatively low quality.
We present a trimming program named Atria. Atria matches the adapters in paired reads and finds possible overlapped regions with a super-fast and carefully designed byte-based matching algorithm (O(n) time with O(1) space). Atria also implements multi-threading in both sequence processing and file compression and supports single-end reads.
Atria performs favorably against other cutting-edge trimmers in various trimming and runtime benchmarks on both simulated and real data. We also provide an ultra-fast and lightweight byte-based matching algorithm. The algorithm can be used in a broad range of short-sequence matching applications, such as primer search and seed scanning before alignment.
The Atria executables, source code, and benchmark scripts are available at https://github.com/cihga39871/Atria under the MIT license.
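
To give a feel for the kind of compact, linear-time matching the abstract describes, the sketch below scans a read for an adapter using packed integers. It is not Atria's algorithm: the 2-bit base encoding, the XOR-based mismatch count, and the names `pack`, `count_mismatches`, and `find_adapter` are all assumptions made for illustration, and it assumes reads contain only A, C, G, and T. For a fixed adapter length it does one pass over the read and keeps only a single rolling window.

```python
# Illustrative sketch only. The 2-bit encoding and XOR comparison are
# assumptions for this example, not the byte-based scheme Atria implements.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(seq):
    """Pack a DNA string (A/C/G/T only) into an integer, 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def count_mismatches(x, y, k):
    """Count differing bases between two packed sequences of length k."""
    diff = x ^ y
    # Collapse each 2-bit pair to one bit that is set if either bit differs.
    low_mask = int("01" * k, 2)
    collapsed = (diff | (diff >> 1)) & low_mask
    return bin(collapsed).count("1")


def find_adapter(read, adapter, max_mismatches=1):
    """Return the first 0-based position where the adapter matches the read
    with at most max_mismatches substitutions, or -1 if there is none."""
    k = len(adapter)
    if k == 0 or len(read) < k:
        return -1
    target = pack(adapter)
    window_mask = (1 << (2 * k)) - 1
    window = pack(read[:k])
    for start in range(len(read) - k + 1):
        if start > 0:
            # Slide the window by one base: shift in the new base, drop the oldest.
            window = ((window << 2) | CODE[read[start + k - 1]]) & window_mask
        if count_mismatches(window, target, k) <= max_mismatches:
            return start
    return -1
```

Under these assumptions, a read could be trimmed with `read[:pos]` whenever `pos = find_adapter(read, adapter)` is not `-1`. A real trimmer must also handle `N` bases, adapters truncated at the read end, and base qualities, all of which this sketch leaves out.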

Additional details

Additional information:

https://scicrunch.org/resolver/RRID:SCR_021313

GitHub links:

https://github.com/cihga39871/Atria

Accessions (data referenced by this study):

SRA: SRR330569
SRA: ERR4695159





File Name  Sample ID  Data Type       File Format  Size       Release Date
                      GitHub archive  zip          2.47 MB    2021-09-26
                      HTML            HTML         3.54 MB    2021-09-26
                      HTML            HTML         3.51 MB    2021-09-26
                      Tabular Data    CSV          1.73 KB    2021-09-26
                      Tabular Data    CSV          655.87 KB  2021-09-26
                      Tabular Data    CSV          1.88 KB    2021-09-26
                      Tabular Data    CSV          1.77 KB    2021-09-26
                      readme          TEXT         4.88 KB    2021-09-26
                      Tabular Data    CSV          0.54 KB    2021-09-26
                      Tabular Data    CSV          2.61 KB    2021-09-26
Displaying 1-10 of 12 File(s).
Funding body          Awardee  Award ID  Comments
Government of Canada  X Li               Genomics Research and Development Initiative
Date                Action
September 29, 2021  Dataset publish
September 29, 2021  Funder updated: Government of Canada