Additional file 1: of Understanding the importance of key risk factors in predicting chronic bronchitic symptoms using a machine learning approach

Deng, Huiyu; Urman, Robert; Gilliland, Frank; Eckel, Sandrah

doi:10.6084/m9.figshare.7927781.v1

12874_2019_708_MOESM1_ESM.docx (3.31 MB)

journal contribution

posted on 2019-03-29, 05:00 authored by Huiyu Deng, Robert Urman, Frank Gilliland, Sandrah Eckel

Table S1. Comparison of gradient boosting models fit for all participants and all predictors, for 50 different random training sets.

Table S2. Accuracy, sensitivity, and specificity of models fit separately with groups of risk factors for all participants, asthmatics, and non-asthmatics, for 50 different random holdout test datasets.

Table S3. Average AUC of models trained on various groups of risk factors using data from all participants and validated separately by asthma status, for 50 random training sets.

Table S4. Average area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity of models fit separately with groups of risk factors for non-asthmatics, non-asthmatics (rhinitis), and non-asthmatics (no rhinitis), for 50 different across- and within-participant holdout test datasets.

Table S5. Comparison of gradient boosting models vs. logistic regression for all participants, asthmatics, and non-asthmatics, averaged across 50 training sets.

Table S6. Logistic regression results for all participants, asthmatics, and non-asthmatics for a random training set.

Figure S1. Boxplot of relative influence, for 50 different random training sets, of the top 10 risk factors in models fit using all predictor variables for non-asthmatics, non-asthmatics (rhinitis), and non-asthmatics (no rhinitis).

Figure S2. Area under the receiver operating characteristic curve (AUC) of the gradient boosting models and logistic regression models fit separately with all risk factors and top 10 most important risk factors for 50 different random across-participant holdout test datasets.

Figure S3. Area under the receiver operating characteristic curve (AUC) of the gradient boosting models and logistic regression models fit separately with all risk factors and top 10 most important risk factors for 50 different random within-participant holdout test datasets.

(DOCX 3711 kb)

Funding

National Institutes of Health

Keywords

Bronchitic symptoms; Air pollution; Machine learning; Gradient boosting model; Prediction model

Licence

CC BY + CC0
