Data on the cytotoxic T lymphocyte epitopes identified from the SARS-CoV-2 isolated in India

2020-04-22T10:50:01Z (GMT) by Viswajit Mulpuru, Nidhi Mishra

This fileset consists of the following datasets:

18670_NetCTLpan.xls in xls file format

epitope_immunogenicity_score.txt in .txt file format

NetCTLpan_out_sorted.txt in .txt file format

The zipped folder BatchPeptideMatch-202003311256376185117084.zip contains the following datasets:

log.txt in .txt file format

perPeptideMatchDetails.txt in .txt file format

139 .txt files contained within the folder “PerPeptideMatchResults”

The three zipped folders HPEPDOCK_results_3kps.tar.gz, HPEPDOCK_results_6at5.tar.gz and HPEPDOCK_results_6o9c.tar.gz each contain 101 data files in .pdb file format.
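As a quick integrity check after download, the number of .pdb files in each HPEPDOCK archive can be compared with the count stated above. The following is a minimal sketch using only the Python standard library; the function name `count_pdb_members` is illustrative, and the internal folder layout of the archives is not assumed.

```python
import tarfile


def count_pdb_members(archive_path):
    """Count regular files ending in .pdb inside a gzip-compressed tar archive.

    The archive layout is not assumed: members are counted at any depth.
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        return sum(
            1
            for member in archive.getmembers()
            if member.isfile() and member.name.lower().endswith(".pdb")
        )
```

Calling `count_pdb_members("HPEPDOCK_results_3kps.tar.gz")` on a downloaded copy should return 101 if the archive matches the description above.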


Dataset 18670_NetCTLpan.xls contains data on the cytotoxic T lymphocyte epitopes identified from the SARS-CoV-2 isolated in India as predicted by NetCTLpan.

Dataset epitope_immunogenicity_score.txt contains the immunogenicity scores of all the predicted epitopes, as calculated with the Immune Epitope Database (IEDB) immunogenicity tool.

Dataset NetCTLpan_out_sorted.txt contains the NetCTLpan predictions for the HLA-A*03:01 allele.

Datasets log.txt and the 139 .txt files contained within the folder “PerPeptideMatchResults” contain data from the peptide matching step, which was performed on the sequence dataset ‘UniProtKB release 2020_01 plus isoforms | SwissProt | Isoform’ with the target organism set as ‘Homo sapiens [9606]’.

The .pdb files contained in the three zipped folders HPEPDOCK_results_3kps.tar.gz, HPEPDOCK_results_6at5.tar.gz and HPEPDOCK_results_6o9c.tar.gz are the results of the molecular docking studies performed with the HPEPDOCK Server. To further assess the foreign epitopes as vaccine candidates, the top three foreign epitopes, ranked by immunogenicity score, were docked to confirm their interactions with the specified HLA at the peptide-binding groove. Each zipped folder contains the docking results for one of the top three vaccine candidates (3kps, 6at5 and 6o9c).


Study aims and methodology: This study aimed to identify cytotoxic T lymphocyte (CTL) epitopes of the SARS-CoV-2 Indian isolate, using an in-silico approach, in order to design potential vaccine candidates that are effective in the Indian population. The authors predicted the CTL epitopes for all human leukocyte antigen (HLA) supertypes that have a high allelic frequency in the Indian population. They also studied the immunogenicity and foreignness of the epitopes, and examined the interactions between the epitopes and the HLA molecules by means of molecular docking studies.

The amino acid sequence encoded by the complete genome of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Indian isolate (MT050493.1), with 9950 amino acids (AA), was retrieved from NCBI.

NetCTLpan version 1.1 was used to predict the CTL epitopes across the proteins coded by the SARS-CoV-2 Indian isolate.

All the predicted epitopes were subjected to the Immune Epitope Database (IEDB) immunogenicity tool to predict their immunogenicity score.

To identify the vaccine candidates that are foreign to the human body, all the epitopes that showed positive immunogenicity were submitted to the Multiple Peptide Match tool and searched against the human reference proteome.

To further assess the foreign epitopes as vaccine candidates, the top three foreign epitopes, ranked by immunogenicity score, were subjected to molecular docking studies to confirm their interactions with the specified HLA at the peptide-binding groove. The molecular docking of each peptide epitope with the HLA structure was performed using the HPEPDOCK Server.
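The selection logic described in the steps above (keep epitopes with positive immunogenicity, discard those that match the human proteome, then take the three highest-scoring ones) can be sketched as follows. This is an illustration only and not the authors' code: the function name, the input structures and the interpretation of "positive immunogenicity" as a score greater than zero are assumptions, and the peptides in the usage example are placeholders rather than values from the datasets.

```python
def select_top_foreign_epitopes(scores, human_matches, top_n=3):
    """Rank foreign, positively scored epitopes by immunogenicity.

    scores        -- dict mapping peptide sequence to immunogenicity score
    human_matches -- set of peptides that matched the human proteome
    top_n         -- number of candidates to keep (three in the study)

    Assumption: "positive immunogenicity" means score > 0.
    """
    candidates = [
        (peptide, score)
        for peptide, score in scores.items()
        if score > 0 and peptide not in human_matches
    ]
    # Highest score first; ties keep their original insertion order.
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates[:top_n]


# Placeholder peptides and scores, for illustration only.
example_scores = {
    "AAAAAAAAA": 0.30,
    "CCCCCCCCC": -0.10,
    "DDDDDDDDD": 0.25,
    "EEEEEEEEE": 0.40,
    "FFFFFFFFF": 0.05,
    "GGGGGGGGG": 0.20,
}
example_human_matches = {"EEEEEEEEE"}

top_three = select_top_foreign_epitopes(example_scores, example_human_matches)
```

In this example `top_three` holds the placeholder peptides scored 0.30, 0.25 and 0.20: the peptide scored 0.40 is dropped because it appears in the human-match set, and the negatively scored one is dropped by the immunogenicity filter.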

For more details on the methodology, please read the published article.