
Data released on December 20, 2017

Supporting data for "Accurate Prediction of Personalized Olfactory Perception from Large-Scale Chemoinformatic Features"

Li, H; Panwar, B; Omenn, G S; Guan, Y (2017): Supporting data for "Accurate Prediction of Personalized Olfactory Perception from Large-Scale Chemoinformatic Features". GigaScience Database. http://dx.doi.org/10.5524/100384

The olfactory stimulus-percept problem has been studied for more than a century, yet it remains hard to precisely predict an odor from the large-scale chemoinformatic features of an odorant molecule. A major challenge is that perceived qualities vary greatly among individuals because of different genetic and cultural backgrounds. Moreover, the combinatorial interactions between multiple odorant receptors and diverse molecules significantly complicate olfaction prediction. Many attempts have been made to establish structure-odor relationships for intensity and pleasantness, but no models are available to predict the personalized multi-odor attributes of molecules. In this study, we describe our winning algorithm for predicting individual and population perceptual responses to various odorants in the DREAM Olfaction Prediction Challenge.
We find that a random forest model consisting of multiple decision trees is well suited to this prediction problem, given the large feature spaces and the high variability of perceptual ratings among individuals. Integrating both population and individual perceptions into our model effectively reduces the influence of noise and outliers. By analyzing the importance of each chemical feature, we find that a small set of low- and non-degenerative features is sufficient for accurate prediction.
Our random forest model successfully predicts personalized odor attributes of structurally diverse molecules. This model, together with the top discriminative features, has the potential to extend our understanding of olfactory perception mechanisms and to provide an alternative for rational odorant design.
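
As an illustration of what "integrating both population and individual perceptions" could look like in practice, the sketch below blends each subject's rating with the population mean rating for the same molecule before any model is trained. This is an assumption made for illustration, not a description of the authors' pipeline: the function name `blend_targets`, the weighting parameter `alpha`, and the toy ratings are all hypothetical, and the actual method is documented in the related manuscript and the GitHub repository listed under "Additional information".

```python
from statistics import mean


def blend_targets(ratings, alpha=0.5):
    """Blend individual ratings with the population mean (hypothetical helper).

    ratings: dict mapping subject id -> list of ratings, one per molecule,
             with every subject rating the same molecules in the same order.
    alpha:   weight on the individual's own rating; (1 - alpha) goes to the
             population mean. alpha=1 keeps individual ratings unchanged,
             alpha=0 replaces them with the population mean.
    """
    subjects = list(ratings)
    n_molecules = len(ratings[subjects[0]])

    # Population mean rating for each molecule across all subjects.
    population = [
        mean(ratings[s][m] for s in subjects) for m in range(n_molecules)
    ]

    # Weighted combination of individual and population ratings.
    return {
        s: [
            alpha * ratings[s][m] + (1 - alpha) * population[m]
            for m in range(n_molecules)
        ]
        for s in subjects
    }


# Toy example with two subjects and two molecules (made-up numbers).
toy = {"subject_a": [0.0, 100.0], "subject_b": [100.0, 100.0]}
blended = blend_targets(toy, alpha=0.5)
```

Pulling each individual rating toward the population mean is one simple way to damp outlying ratings; the blended values could then serve as regression targets for a model such as a random forest.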


Related manuscripts:

doi:10.1093/gigascience/gix127

Additional information:

https://github.com/Hongyang449/olfaction_prediction_manuscript

https://www.synapse.org/#!Synapse:syn3098005

Keywords:

olfactory perception, structure-odor relationships, random forest, chemoinformatics

Software


Funding:

  • Funding body - National Science Foundation
  • Award ID - 1452656
  • Awardee - Y Guan
  • Funding body - Alzheimer’s Association
  • Award ID - BAND-15-367116
  • Awardee - Y Guan

Samples:

Sample ID | Taxonomic ID | Common Name | Genbank Name | Scientific Name | Sample Attributes

14328-2  non-biological sample
Description: 3,4-Dimethoxyacetophenone
Relevant electronic resources: pubchem.ncbi.nlm.nih...
Sample source: Sigma-Aldrich
...

14491-2  non-biological sample
Description: 1,6-Hexanedithiol
Relevant electronic resources: pubchem.ncbi.nlm.nih...
Sample source: Sigma-Aldrich
...

14514-2  non-biological sample
Description: 2-Acetyl-5-methylfuran
Relevant electronic resources: pubchem.ncbi.nlm.nih...
Sample source: Sigma-Aldrich
...

14525-2  non-biological sample
Description: l-fenchone
Relevant electronic resources: pubchem.ncbi.nlm.nih...
Sample source: Sigma-Aldrich
...

145742-2  non-biological sample
Description: L-Proline
Relevant electronic resources: pubchem.ncbi.nlm.nih...
Sample source: Sigma-Aldrich
...

15037-2  non-biological sample
Description: 1-Furfurylpyrrole
Relevant electronic resources: pubchem.ncbi.nlm.nih...
Sample source: Sigma-Aldrich
...

15380-2  non-biological sample
Description: Bis(methylthio)methane
Relevant electronic resources: pubchem.ncbi.nlm.nih...
Sample source: Sigma-Aldrich
...

1549025-2  non-biological sample
Description: Neryl acetate
Relevant electronic resources: pubchem.ncbi.nlm.nih...
Sample source: Sigma-Aldrich
...

1549026-2  non-biological sample
Description: geranyl acetate
Relevant electronic resources: pubchem.ncbi.nlm.nih...
Sample source: Sigma-Aldrich
...

1549778-2  non-biological sample
Description: geranylacetone
Relevant electronic resources: pubchem.ncbi.nlm.nih...
Sample source: NA
...

Displaying 71-80 of 476 Sample(s).

Files:

File Name | Sample ID | File Type | File Format | Size | Release Date
 |  | Tabular Data | CSV | 1.55 MB | 2017-12-01
 |  | Tabular Data | CSV | 554.4 KB | 2017-12-01
 |  | Text | TEXT | 403.77 KB | 2017-12-01
 |  | mixed archive | archive | 92.54 MB | 2017-12-01
 |  | mixed archive | archive | 7.61 MB | 2017-12-01
 |  | Readme | TEXT | 2.97 KB | 2017-12-01
 |  | mixed archive | archive | 15.77 MB | 2017-12-01
 |  | Tabular Data | CSV | 1.84 KB | 2017-12-01
Displaying 1-8 of 8 File(s).
