Additional file 14: of Metagenomic characterization of ambulances across the USA

Figure S20. Boxplots of classifier performance over model-specific parameter sweeps during training (80/20 split) on overlap data for city class. Classes were upsampled, and models were optimized for mean ROC score. Shown are kappa and balanced accuracy, averaged over classes. rf, random forest; gbm, stochastic gradient boosting; rrf, regularized random forest; c50, C5.0 decision tree; pls, partial least squares; en, elastic net; knn, k-nearest neighbors; svm linear, support vector machine with linear kernel; rbf svm, support vector machine with radial basis function (RBF) kernel. (DOCX 97 kb)
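
For readers unfamiliar with the two metrics in the figure, the following is a minimal Python sketch of how Cohen's kappa and class-averaged balanced accuracy can be computed for a multi-class problem. It is an illustration only, not the authors' pipeline: the function names and the city labels are hypothetical, and it assumes that balanced accuracy is computed one-vs-rest per class as (sensitivity + specificity) / 2 and then averaged over classes.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal label frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 0.0  # degenerate case: a single class everywhere
    return (observed - expected) / (1.0 - expected)


def mean_balanced_accuracy(y_true, y_pred):
    """Average of per-class (sensitivity + specificity) / 2, one-vs-rest (assumed definition)."""
    classes = sorted(set(y_true))
    if len(classes) < 2:
        raise ValueError("need at least two classes")
    pairs = list(zip(y_true, y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        scores.append((sensitivity + specificity) / 2)
    return sum(scores) / len(scores)


# Hypothetical city labels, for illustration only.
y_true = ["NYC", "NYC", "LA", "LA", "SF", "SF"]
y_pred = ["NYC", "LA", "LA", "LA", "SF", "NYC"]

print(round(cohen_kappa(y_true, y_pred), 3))
print(round(mean_balanced_accuracy(y_true, y_pred), 3))
```

Both metrics correct for chance agreement or class imbalance in ways that plain accuracy does not, which is useful when the number of samples per city is uneven.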