Supporting data for "AMR-meta: a k-mer and metafeature approach to classify antimicrobial resistance from high-throughput short-read metagenomics data"

Dataset type: Metagenomic, Software
Data released on February 17, 2022

Marini S; Oliva M; Slizovskiy I; Das RA; Noyes NR; Kahveci T; Boucher C; Prosperi M (2022): Supporting data for "AMR-meta: a k-mer and metafeature approach to classify antimicrobial resistance from high-throughput short-read metagenomics data". GigaScience Database. http://dx.doi.org/10.5524/102197

DOI: 10.5524/102197

Antimicrobial resistance (AMR) is a global health concern. High-throughput metagenomic sequencing of microbial samples enables profiling of AMR genes through comparison with curated AMR databases. However, the performance of current methods is often hampered by database incompleteness and by the presence of homology/homoplasy with non-AMR genes in sequenced samples.
We present AMR-meta, a database-free and alignment-free approach, based on k-mers, which combines algebraic matrix factorization into metafeatures with regularized regression. Metafeatures capture multi-level gene diversity across the main antibiotic classes. AMR-meta takes in reads from metagenomic shotgun sequencing and outputs predictions about whether those reads contribute to resistance against specific classes of antibiotics. In addition, AMR-meta employs an augmented training strategy that joins an AMR gene database with non-AMR genes (used as negative examples). We compare AMR-meta with AMRPlusPlus, DeepARG, and Meta-MARC, and further test their ensemble via a voting system. In cross-validation, AMR-meta has a median (interquartile range) F-score of 0.7 (0.2-0.9). On semi-synthetic metagenomic data (the external test), AMR-meta yields on average a 1.3-fold hit rate increase over existing methods. In terms of run-time, AMR-meta is 3x faster than DeepARG, 30x faster than Meta-MARC, and as fast as AMRPlusPlus. Finally, we note that differences in AMR ontologies and the observed variance of all tools in classification outputs call for further development on standardization of benchmarking data and protocols.
AMR-meta is a fast, accurate classifier that exploits non-AMR negative sets to improve sensitivity and specificity. The differences in AMR ontologies and the high variance of all tools in classification outputs call for the deployment of standard benchmarking data and protocols, to fairly compare AMR prediction tools.
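As a rough illustration of the k-mer representation that the abstract refers to, the following minimal Python sketch turns short reads into fixed-length k-mer count vectors. It is not the AMR-meta implementation: the function names (`kmer_counts`, `feature_matrix`), the choice of k = 2, the handling of ambiguous bases, and the toy reads are all assumptions made for the example. In an approach like the one described, a matrix of this kind would be the input to a factorization step and a regularized regression, neither of which is shown here.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_counts(read, k):
    """Count overlapping k-mers in a read.

    k-mers containing characters outside A, C, G, T (e.g. N) are skipped;
    this is an assumption of the sketch, not a documented AMR-meta rule.
    """
    read = read.upper()
    counts = Counter()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if set(kmer) <= set(ALPHABET):
            counts[kmer] += 1
    return counts


def feature_matrix(reads, k):
    """Build a reads-by-k-mers count matrix over all 4**k possible k-mers."""
    vocabulary = ["".join(p) for p in product(ALPHABET, repeat=k)]
    rows = []
    for read in reads:
        counts = kmer_counts(read, k)
        rows.append([counts[kmer] for kmer in vocabulary])
    return vocabulary, rows


# Toy reads (hypothetical, not taken from the dataset).
reads = ["ACGTAC", "TTTTNA"]
vocabulary, matrix = feature_matrix(reads, k=2)
```

Each row of `matrix` has one entry per possible k-mer, so every read maps to a vector of the same length regardless of read length.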

Additional details

Read the peer-reviewed publication(s):

(PubMed: 35583675)

GitHub links:

https://github.com/smarini/AMR-meta





File Name | Sample ID | Data Type | File Format | Size | Release Date
 | | GitHub archive | zip | 187.44 MB | 2022-02-03
 | | Singularity container | UNKNOWN | 796.64 MB | 2022-02-03
 | | Tabular data | CSV | 2.06 KB | 2022-03-01
 | | Tabular data | CSV | 2.84 KB | 2022-03-01
 | | Tabular data | CSV | 3.25 KB | 2022-03-01
 | | Tabular data | CSV | 5.91 KB | 2022-03-01
 | | Readme | TEXT | 4.51 KB | 2022-02-03
 | | Tabular data | CSV | 2.89 KB | 2022-03-01
 | | Tabular data | CSV | 0.26 KB | 2021-03-01

Funding body | Awardee | Award ID | Comments
National Institutes of Health | C Boucher | R01AI141810 |
National Science Foundation | M Prosperi | 2013998 |
United States Department of Agriculture | N R Noyes | 2019-67017-29110 |

Date | Action
February 17, 2022 | Dataset publish
March 7, 2022 | Manuscript link added: 10.1093/gigascience/giac029
October 7, 2022 | Manuscript link updated: 10.1093/gigascience/giac029