Data and software to accompany the paper: Applying compressed sensing to genome-wide association studies.
Dataset type: Software
Data released on June 06, 2014
The aim of a genome-wide association study (GWAS) is to isolate DNA markers for variants affecting phenotypes of interest. Linear regression is employed for this purpose, and in recent years a signal-processing paradigm known as compressed sensing (CS) has coalesced around a particular class of regression techniques. CS is not a method in its own right, but rather a body of theory regarding signal recovery when the number of predictor variables (i.e., genotyped markers) exceeds the sample size. The paper shows that CS theory applies to GWAS, where the purpose is to find trait-associated tagging markers (genetic variants).
Analysis scripts are contained in the compressed CS file. Mock data and the scripts that accompany them are contained in the compressed GD file. The example scripts in the CS repository require the GD files to be unpacked in a separate folder. Before using either repository, please read its accompanying readme PDF and the annotations in the example scripts.
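To illustrate the setting described above, where genotyped markers outnumber samples, the following is a minimal sketch and not the authors' analysis scripts. It assumes L1-penalized regression (lasso) as one representative sparse-regression technique, solved with a plain iterative soft-thresholding loop. The genotypes and phenotype are simulated, and all names and parameter values (`lasso_ista`, `lam`, the sample size, the marker count, the number of causal markers) are hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t; entries within t of zero become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(X, y, lam, n_iter=1000):
    """Approximately minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by
    iterative soft-thresholding (ISTA)."""
    n, p = X.shape
    # Step size 1/L, where L is the largest eigenvalue of X^T X / n.
    step = n / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta


# Simulated data (assumption): more markers (p) than samples (n), few causal markers (s).
rng = np.random.default_rng(0)
n, p, s = 200, 1000, 5

# Genotypes coded 0/1/2 with allele frequency 0.5, then standardized per marker.
G = rng.binomial(2, 0.5, size=(n, p)).astype(float)
X = (G - G.mean(axis=0)) / G.std(axis=0)

# Sparse true effects plus a small amount of noise.
causal = rng.choice(p, size=s, replace=False)
beta_true = np.zeros(p)
beta_true[causal] = 1.0
y = X @ beta_true + 0.1 * rng.standard_normal(n)
y = y - y.mean()

beta_hat = lasso_ista(X, y, lam=0.1)

# Markers with the s largest estimated effects, compared against the causal set.
top = np.argsort(np.abs(beta_hat))[-s:]
print("causal markers:  ", sorted(causal.tolist()))
print("top-ranked markers:", sorted(top.tolist()))
```

In this sketch the penalty weight `lam` controls how many markers receive a nonzero coefficient. The value used here is an arbitrary choice for the simulated data, not a recommendation for real genotype data.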
Read the peer-reviewed publication(s):
Vattikuti, S., Lee, J. J., Chang, C. C., Hsu, S. D. H., & Chow, C. C. (2014). Applying compressed sensing to genome-wide association studies. GigaScience, 3(1). doi:10.1186/2047-217x-3-10