Supporting data for "Atria: An Ultra-fast and Accurate Trimmer for Adapter and Quality Trimming"

Dataset type: Genomic, Software, Transcriptomic, Bioinformatics
Data released on September 29, 2021

Chuan J; Zhou A; Hale LR; He M; Li X (2021): Supporting data for "Atria: An Ultra-fast and Accurate Trimmer for Adapter and Quality Trimming" GigaScience Database.


As next-generation sequencing takes a dominant role in output capacity and sequence length, adapters attached to the reads and low-quality bases degrade downstream analysis both directly and indirectly, for example by producing false-positive single nucleotide polymorphisms (SNPs) and fragmented assemblies. A fast trimming algorithm is needed to remove adapters precisely, especially in read tails, where quality is relatively low.
We present a trimming program named Atria. Atria matches adapters in paired reads and finds possible overlapping regions using a fast, carefully designed byte-based matching algorithm that runs in O(n) time and O(1) space. Atria also implements multi-threading in both sequence processing and file compression, and it supports single-end reads.
Atria performs favorably against other cutting-edge trimmers in trimming and runtime benchmarks on both simulated and real data. We also provide the ultra-fast, lightweight byte-based matching algorithm itself, which can be used in a broad range of short-sequence matching applications, such as primer search and seed scanning before alignment.
The Atria executables, source code, and benchmark scripts are available under the MIT license.
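
To make the idea of linear-time, constant-space short-sequence matching more concrete, here is a minimal sketch in Python. It is not Atria's implementation, and nothing in it is taken from the Atria source: the one-hot 4-bit base encoding, the function names `pack` and `find_adapter`, and the parameters `k` and `min_matches` are all assumptions chosen for illustration. The sketch packs the first `k` bases of an adapter into one integer, slides a same-sized packed window along the read, and counts matching bases with a bitwise AND followed by a bit count.

```python
# Illustrative sketch only; encoding and names are assumptions, not Atria's code.
# One-hot 4-bit codes: a base matches only itself; anything else (e.g. N) is 0000.
CODE = {"A": 0b0001, "C": 0b0010, "G": 0b0100, "T": 0b1000}


def pack(seq):
    """Pack a DNA string into an int, 4 bits per base, first base highest."""
    value = 0
    for base in seq:
        value = (value << 4) | CODE.get(base, 0)
    return value


def find_adapter(read, adapter, k=16, min_matches=14):
    """Return (position, matches) for the best k-base window of `read` that
    agrees with the first k bases of `adapter` in at least `min_matches`
    positions, or None if no window qualifies."""
    if len(read) < k or len(adapter) < k:
        return None
    mask = (1 << (4 * k)) - 1          # keeps only the last k bases
    probe = pack(adapter[:k])
    window = 0
    best = None
    for i, base in enumerate(read):
        # Shift in one base per step: constant work and constant state.
        window = ((window << 4) | CODE.get(base, 0)) & mask
        if i + 1 < k:
            continue                   # window not yet full
        # Each matching base leaves exactly one set bit after the AND.
        matches = bin(window & probe).count("1")
        if matches >= min_matches and (best is None or matches > best[1]):
            best = (i + 1 - k, matches)
    return best


if __name__ == "__main__":
    adapter = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"
    read = "ACGTACGTAC" + "AGATCGGTAGAGCACACG"  # one mismatch in the adapter
    print(find_adapter(read, adapter))
```

Because `k` is fixed, each step does a constant amount of work, so the scan is linear in the read length and keeps only a few integers of state. A real trimmer would also need to handle adapters truncated at the read tail, base qualities, and paired-end overlap, none of which this sketch attempts.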

Additional details

Accessions (data referenced by this study):

SRA: SRR330569
SRA: ERR4695159

File Name | Sample ID | Data Type | File Format | Size | Release Date
 |  | GitHub archive | zip | 2.47 MB | 2021-09-26
 |  | HTML | HTML | 3.54 MB | 2021-09-26
 |  | HTML | HTML | 3.51 MB | 2021-09-26
 |  | Tabular Data | CSV | 1.73 KB | 2021-09-26
 |  | Tabular Data | CSV | 655.87 KB | 2021-09-26
 |  | Tabular Data | CSV | 1.88 KB | 2021-09-26
 |  | Tabular Data | CSV | 1.77 KB | 2021-09-26
 |  | readme | TEXT | 4.88 KB | 2021-09-26
 |  | Tabular Data | CSV | 0.54 KB | 2021-09-26
 |  | Tabular Data | CSV | 2.61 KB | 2021-09-26
Displaying 1-10 of 12 File(s).
Funding body | Awardee | Award ID | Comments
Government of Canada | X Li |  | Genomics Research and Development Initiative
Date | Action
September 29, 2021 | Dataset publish
September 29, 2021 | Funder updated: Government of Canada