Data and code on serum Raman spectroscopy as an efficient primary screening of coronavirus disease in 2019 (COVID-19)

This fileset consists of 13 data files, 1 code file and 2 ReadMe files.

The dataset data.mat is in .mat file format and is therefore not openly accessible. The following datasets are an openly accessible version of the .mat file:


Fig2_1.txt in .txt file format

Fig2_2.txt in .txt file format

Fig2_3.txt in .txt file format

Fig2_4.txt in .txt file format

Fig2_5.txt in .txt file format

Fig2_6.txt in .txt file format

raw_COVID.txt in .txt file format

raw_Helthy.txt in .txt file format

raw_Suspected.txt in .txt file format

raw_Tube.txt in .txt file format

table2_data.txt in .txt file format

wave_number.txt in .txt file format

The code file is the following: code.m in .m file format

The two ReadMe files are the following: readme.txt in .txt file format and readme.m in .m file format.


Data in Fig2_1.txt, Fig2_2.txt, Fig2_3.txt, Fig2_4.txt, Fig2_5.txt and Fig2_6.txt were used to plot Figure 2 in the related manuscript.

raw_COVID.txt contains the raw Raman spectroscopy data from the serum samples obtained from the 53 confirmed COVID-19 patients.

raw_Helthy.txt contains the raw Raman spectroscopy data from the serum samples obtained from healthy individuals.

raw_Suspected.txt contains the raw Raman spectroscopy data from the serum samples obtained from suspected cases (individuals suspected of COVID-19 infection).

raw_Tube.txt contains the raw spectra data from cryopreservation tubes with saline solution inside.

wave_number.txt contains the Raman shift (wavenumber) values for the spectra.

table2_data.txt was used to generate Table 2 in the related manuscript.

The code code.m was used for data processing.


Software needed to access data: data.mat can only be accessed using the MATLAB software. Running the code code.m also requires MATLAB.
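For readers without MATLAB, the .txt files can in principle be read with any general-purpose language. The following is a minimal sketch in Python (standard library only), not part of the fileset. It assumes each .txt file holds plain numeric values separated by whitespace, tabs or commas, with one row per line; this layout is an assumption and should be checked against the actual files. The function name read_numeric_table is hypothetical.

```python
import io


def read_numeric_table(source):
    """Parse delimited numeric text into a list of rows of floats.

    `source` is any iterable of text lines, e.g. an open file.
    Assumes whitespace-, tab- or comma-separated numbers (an assumption
    about the .txt layout, not something the ReadMe states).
    """
    rows = []
    for line in source:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.replace(",", " ").split()
        rows.append([float(field) for field in fields])
    return rows


# Demonstration on made-up values (NOT real data from the fileset).
demo = io.StringIO("600.0\t601.5\t603.0\n1.0,2.0,3.0\n")
table = read_numeric_table(demo)

# Hypothetical usage with a file from the fileset:
# with open("raw_COVID.txt") as handle:
#     spectra = read_numeric_table(handle)
```

If the files turn out to use a different delimiter or include header lines, the parsing step would need to be adjusted accordingly.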


Study aims and methodology: The recommended diagnostic method for coronavirus disease (COVID-19) is a qPCR-based technique; however, it is a time-consuming, expensive and sample-dependent procedure with a relatively high false-negative rate. The aim of this study was to develop a widely available, cheap and quick method to diagnose COVID-19 based on Raman spectroscopy.

A total of 157 serum samples were collected from 53 confirmed patients, 54 suspected cases (fever but not COVID-19) and 50 healthy controls. Raman spectroscopy was used to analyse these samples, and a support vector machine (SVM), a machine learning method, was applied to the spectral dataset to build a diagnostic algorithm.

The experimental setup consisted of a Volume Phase Holographic (VPH) spectrograph, a deep-cooled CCD camera, a Raman probe and a laser.

A total of 2355 spectra from 157 individuals were imported into MATLAB (R2013a) software (MathWorks, Inc.).

For more details on the methodology, please read the related article.