Code for Fast and Scalable Implementation of the Bayesian SVM
Dataset posted on 2017-12-29, 18:21, authored by Florian Wenzel, Théo Galy-Fajou, Matthäus Deutsch, and Marius Kloft.
This dataset contains the Julia code package for the Bayesian SVM algorithm described in the ECML PKDD 2017 paper, Wenzel et al.: Bayesian Nonlinear Support Vector Machines for Big Data.
Files are provided in .jl format, containing code in the Julia language, a high-performance dynamic programming language for numerical computing. These files can be opened with any freely available text editor. To run the code, please see the description below or the more detailed wiki.
BSVM.jl - contains the module to run the Bayesian SVM algorithm.
AFKMC2.jl - File for the Assumption-Free K-MC² algorithm (k-means seeding)
KernelFunctions.jl - Module for the kernel type
DataAccess.jl - Module for either generating data or reading it from an existing dataset
run_test.jl and paper_experiments.jl - Modules to run the algorithm on a dataset file and compute accuracy via n-fold cross-validation, as well as the Brier score and the log score
test_functions.jl and paper_experiment_functions.jl - Sets of datatypes and functions for efficient testing.
ECM.jl - Module for expectation conditional maximization (ECM) for nonlinear Bayesian SVM
For datasets used in the related experiments please see https://doi.org/10.6084/m9.figshare.5443621
Requirements
The BayesianSVM package only works with Julia versions > 0.5. Other necessary packages are added automatically during installation. It is also possible to run the package from Python; to do so, please check Pyjulia. If you prefer to use R, you can use RJulia. All of these options are a bit technical, since Julia is still a young language.
Installation
To install the latest version of the package, run the following in Julia:
Pkg.clone("git://github.com/theogf/BayesianSVM.jl.git")
Running the Algorithm
Here are the basic steps for using the algorithm:
using BayesianSVM
Model = BSVM(X_training, y_training)              # build the model from the training data
Model.Train()                                     # run inference / train the model
y_predic = sign(Model.Predict(X_test))            # point predictions in {-1, 1}
y_uncertaintypredic = Model.PredictProb(X_test)   # class-membership probabilities
Where X_training should be a matrix of size NSamples x NFeatures, and y_training should be a vector with entries in {1, -1}.
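Below is a minimal end-to-end sketch on synthetic data. It assumes only the BSVM interface shown above; the toy data, the train/test split, and the accuracy computation are purely illustrative.

using BayesianSVM

# Illustrative toy problem: two features, labels in {-1, 1}.
N = 500
X = randn(N, 2)                        # NSamples x NFeatures
y = sign.(X[:, 1] .+ 0.5 .* X[:, 2])   # labels derived from a simple linear rule

# Simple train/test split (illustrative).
ntrain = 400
X_training, y_training = X[1:ntrain, :], y[1:ntrain]
X_test, y_test = X[ntrain+1:end, :], y[ntrain+1:end]

Model = BSVM(X_training, y_training)
Model.Train()

y_predic = sign.(Model.Predict(X_test))
println("Test accuracy: ", mean(y_predic .== y_test))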
You can find a more complete description in the Wiki
Background
We propose a fast inference method for Bayesian nonlinear support vector machines that leverages stochastic variational inference and inducing points. Our experiments show that the proposed method is faster than competing Bayesian approaches and scales easily to millions of data points. It provides additional features over frequentist competitors such as accurate predictive uncertainty estimates and automatic hyperparameter search.
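As a small illustration of the predictive uncertainty estimates, here is a hedged sketch that thresholds the class-membership probabilities returned by Model.PredictProb. It assumes a trained Model as in the example above and that PredictProb returns the estimated probability of the positive class; the 0.4 margin is an arbitrary illustrative threshold.

p = Model.PredictProb(X_test)        # assumed: estimated probability of label +1 for each test point
confident = abs.(p .- 0.5) .> 0.4    # flag predictions the model is fairly sure about (illustrative threshold)
y_hat = sign.(p .- 0.5)              # map probabilities to {-1, 1} labels
println("Fraction of confident predictions: ", mean(confident))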
Please also check out our github repository:
github.com/theogf/BayesianSVM.jl
Funding
This work was partly funded by the German Research Foundation (DFG), award KL 2698/2-1.
Research Data Support
Research data support provided by Springer Nature.
Keywords
Bayesian Approximative Inference, Support Vector Machines, Kernel Methods, Big Data, Bayesian nonlinear support vector machines, SVM, Statistical machine learning, stochastic, uncertainty quantification, class membership probabilities, cancer screening, decision making, supervised classification algorithm, classification algorithm, Bayesian inference techniques, machine learning