Springer Nature
4 files

Datasets and metadata supporting the published article: Somatic Genetic Aberrations in Benign Breast Disease and the Risk of Subsequent Breast Cancer

posted on 2020-05-27, 14:01 authored by Zexian Zeng, Andy H. Vo, Xiaoyu Li, Ali Shidfar, Paulette Saldana, Luis Blanco, Xiaoling Xuei, Yuan Luo, Seema A. Khan, Susan E. Clare
Benign breast disease is an established risk factor for breast cancer, with 30% of breast cancer cases reporting a history of benign breast disease. It is largely unknown how somatic genetic alterations in benign breast tissue contribute to the development of breast cancer (BC).
Here, the authors performed whole exome sequencing on the benign biopsies of patients who subsequently developed breast cancer (cases) and of matched controls who have not developed breast cancer to date, in order to evaluate genetic aberrations within normal tissues that are associated with malignancy.

Data access: Data supporting figures 1, 4 and 5, table 1, and supplementary tables 1-4 are publicly available in the figshare repository: https://doi.org/10.6084/m9.figshare.12191793. Whole exome sequencing data generated during the current study are publicly available in the NCBI Sequence Read Archive (SRA) here: https://identifiers.org/insdc.sra:SRP219328. TCGA data supporting figure 2 were downloaded from the Genomic Data Commons (GDC) data portal through a dbGaP application. The link to the relevant dbGaP study is https://identifiers.org/dbgap:phs000178.v1.p1.

IRB approval and patient consent: This study was carried out under the IRB-approved protocol Northwestern University NU 09B2. Eligible subjects provided informed consent for the use of their benign breast biopsy (BBB) blocks after the nature and possible consequences of the study were explained, and completed a survey detailing breast history and breast cancer risk factors.

Study aims and methodology: Here, the authors established a case-control study of women with a history of benign breast biopsy (BBB). To evaluate the molecular alterations that enable cancer development in the breast, the authors performed whole exome sequencing on the benign biopsies of patients who subsequently developed breast cancer (cases) and of matched controls who have not developed breast cancer to date.
A total of 204 women participated in the study. Cases (n=135) developed BC at least one year after BBB, and controls (n=69) did not develop BC over an average of 17 years following BBB. Cases were matched to controls by age and type of benign change: non-proliferative or proliferative disease without atypia (PDWA). Whole exome sequencing (WES) was performed on the BBB samples. Germline DNA (available from n=26 participants) was used to develop a mutation-calling pipeline that differentiates somatic from germline variants.
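As a rough illustration of what differentiating somatic from germline variants means when matched germline DNA is available, the sketch below subtracts a participant's germline calls from the calls made on the biopsy. This is not the authors' pipeline, which is described in the published article. The `Variant` type, the function name and the example coordinates are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record: position plus alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def split_somatic_germline(
    biopsy_calls: Iterable[Variant],
    germline_calls: Iterable[Variant],
) -> Tuple[List[Variant], List[Variant]]:
    """Split biopsy calls into putative somatic and germline variants.

    A biopsy variant that also appears in the matched germline sample is
    treated as inherited; everything else is treated as putatively somatic.
    """
    germline = set(germline_calls)
    somatic: List[Variant] = []
    inherited: List[Variant] = []
    for variant in biopsy_calls:
        if variant in germline:
            inherited.append(variant)
        else:
            somatic.append(variant)
    return somatic, inherited


# Toy input (made up, not taken from the study data).
biopsy = [Variant("chr1", 100, "A", "G"), Variant("chr7", 200, "C", "T")]
blood = [Variant("chr7", 200, "C", "T")]
somatic, inherited = split_somatic_germline(biopsy, blood)
```

Direct subtraction only works for participants with a matched germline sample. Since germline DNA was available for 26 of the 204 participants, the remaining samples would need a different strategy, which this sketch does not attempt to reproduce.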
Using both aligned reads and identified mutations, the authors studied the genetic aberrations that distinguish cases from controls, including mutations and copy number variations (CNVs).
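One common way to express how strongly a genetic aberration distinguishes cases from controls in a case-control design is an odds ratio computed from a 2x2 table. The sketch below is a generic example, not the statistical method used in the article. The function name is hypothetical, and the carrier counts in the usage lines are invented placeholders; only the group sizes (135 cases, 69 controls) come from this record.

```python
def odds_ratio(case_mut: int, case_total: int,
               ctrl_mut: int, ctrl_total: int) -> float:
    """Odds ratio of carrying a mutation in cases versus controls.

    Adds 0.5 to every cell (Haldane-Anscombe correction) so that the
    ratio stays finite when a cell of the 2x2 table is zero.
    """
    if not (0 <= case_mut <= case_total and 0 <= ctrl_mut <= ctrl_total):
        raise ValueError("carrier counts must lie between 0 and the group size")
    a = case_mut + 0.5                 # cases carrying the mutation
    b = case_total - case_mut + 0.5    # cases without the mutation
    c = ctrl_mut + 0.5                 # controls carrying the mutation
    d = ctrl_total - ctrl_mut + 0.5    # controls without the mutation
    return (a * d) / (b * c)


# Placeholder carrier counts (invented); group sizes are from the study.
example_or = odds_ratio(case_mut=30, case_total=135, ctrl_mut=5, ctrl_total=69)
```

An odds ratio above 1 indicates that the mutation is more frequent among cases than among controls. A complete analysis would also report a confidence interval or a significance test.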
The authors validated their findings using data from TCGA breast cancer samples.
For more details on the methodology, please read the related published article.

Data supporting the figures, tables and supplementary files in the published article: Data file names, formats and direct links to the datasets are included in the file Zeng et al. xlsx.
The following datasets are part of this figshare data record: Supplementary data-NPJBCANCER-00456-R2-AccpetinPrinciple.xlsx in .xlsx file format, predictionClass.txt in .txt file format, and MutationNumber_MUC17.csv in .csv file format.


This study was supported in part by the Breast Cancer Research Foundation, the Lynn Sage Cancer Research Foundation, and grant R21LM012618-01 from the National Institutes of Health.


Research Data Support

Research data support provided by Springer Nature