Springer Nature

Metadata supporting data files of the related article: Heterocellular gene signatures reveal luminal-A breast cancer heterogeneity and differential therapeutic responses

dataset
posted on 2019-08-02, 15:47 authored by Pawan Poudel, Gift Nyamundanda, Yatish Patil, Maggie Cheang, Anguraj Sadanandam
In this study, the authors investigated the heterogeneity of luminal-A breast cancers based on heterocellular gene signatures. The aim was to stratify luminal-A breast cancer samples according to well-characterised and cancer-associated heterocellular subtype signatures defined in colorectal cancers, representing stem, mesenchymal, stromal, immune and epithelial cell types.

While differences between intrinsic breast cancer subtypes have been well studied, heterogeneity within each subtype, especially luminal-A cancers, requires further interrogation to personalise disease management. At the molecular level, breast cancer was one of the first cancer types to be subtyped into intrinsic gene expression subtypes, with five to ten “intrinsic” subtypes now recognised based on gene expression or integrated molecular characteristics, respectively.

It has been established that while some luminal-A tumors are highly responsive to endocrine therapies like tamoxifen, a significant proportion possess intrinsic and/or acquired resistance. Despite luminal-A tumors being a relatively well-characterised breast cancer sub-type, genetic changes alone (mutations and copy number alterations) do not explain the entire spectrum of luminal-A heterogeneity. The factors leading to tumor heterogeneity, including in luminal-A tumors, are complex and include interactions between different cell types and the tumor microenvironment.

Methodology and aims:
The aim of this study was to further investigate breast cancer heterogeneity, especially in the luminal-A subtype, using heterocellular subtype signatures defined in colorectal cancers (CRCs). This approach mirrors the earlier application of breast cancer subtype signatures to other cancers, and was intended to identify low-frequency and novel subtypes that are not apparent from unsupervised approaches.

To characterise the breast cancers using heterocellular subtypes, the classifyCMS function from the published R package CMSClassifier was used. The authors applied the Consensus Molecular Subtypes (CMS) of Colorectal Cancer Classifier to two independent breast cancer datasets: The Cancer Genome Atlas (TCGA, n=817) and GSE42568 (n=104).
The intrinsic breast cancer classification for the GSE42568 dataset was performed using the R-based Bioconductor package genefu.
Raw gene expression and the corresponding survival data of the patient tumors analysed during this study were downloaded from the Gene Expression Omnibus (GEO): GSE42568 and GSE6532 (combined Affymetrix Human Genome U133A and U133B arrays were used). The gene expression profiles for the TCGA breast cancer data were downloaded from the cBioPortal repository.
Affymetrix GeneChip microarray data processing and quality control were performed using robust multi-array average (RMA) normalisation from the R-based Bioconductor package affy.
Hypergeometric sample enrichment analysis of the TCGA dataset was carried out to understand the relationship between the intrinsic breast cancer subtypes and heterocellular subtypes.
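As an illustration of the hypergeometric enrichment test mentioned above, the following is a minimal Python sketch, not the authors' code (the study itself used R). The function name and the sample counts are hypothetical and chosen only to show the calculation: the probability of seeing at least the observed overlap between an intrinsic subtype and a heterocellular subtype if samples were assigned at random.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """Upper-tail probability P(X >= observed) for a hypergeometric X.

    population: total number of samples in the dataset
    successes:  samples belonging to one intrinsic subtype
    draws:      samples assigned to one heterocellular subtype
    observed:   samples belonging to both (must be >= 0)
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only (not taken from the study).
p_value = hypergeom_enrichment_p(population=20, successes=8, draws=5, observed=4)
print(f"enrichment p-value: {p_value:.4f}")
```

A small p-value suggests the overlap is larger than expected by chance; in practice such p-values would usually be adjusted for multiple subtype pairs, a step this sketch omits.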
The luminal-A heterogeneity described by heterocellular subtypes was validated using an additional dataset enriched for estrogen receptor (ER)-positive tumors (luminal-A, GSE6532).
To further characterise heterocellular subtypes in luminal-A breast cancers, heatmap analysis of heterocellular gene expression signatures was performed, comparing luminal-A to non-luminal-A (other subtypes) breast cancer samples. For the heatmap, genes were clustered (hierarchical clustering) by Cluster 3.0, using the default settings, followed by visualisation of the clusters using GENEE from GenePattern.
Gene enrichment analysis was performed to characterise the immune gene expression heterogeneity in luminal-A tumors.
To predict whether inflammatory luminal-A tumors may respond to immune checkpoint blockade therapy, a published “expanded immune gene” signature (https://doi.org/10.1172/JCI91190) was used, which potentially predicts anti-PD1 immune-checkpoint responses in melanoma and other cancers.
The association between heterocellular subtypes and breast cancer phenotypes such as proliferation was assessed using the Kruskal-Wallis test.
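The Kruskal-Wallis test compares a continuous score across more than two groups using ranks rather than raw values. The sketch below computes the tie-corrected H statistic in plain Python; it is an illustration under assumed inputs, not the authors' implementation, and the function names and proliferation scores are hypothetical.

```python
from collections import Counter
from itertools import chain


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups.

    Assumes every group is non-empty and the pooled values are not all equal.
    """
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)
    ties = Counter(pooled)
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    return h / correction


# Hypothetical proliferation scores for three subtypes (illustration only).
scores = [[0.2, 0.4, 0.5], [0.6, 0.7, 0.9], [1.1, 1.3, 1.6]]
print(kruskal_wallis_h(scores))
```

Under the null hypothesis H approximately follows a chi-square distribution with (number of groups - 1) degrees of freedom, so a p-value would normally be obtained from a statistics library.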
To assess the association of tamoxifen treatment response with heterocellular subtypes, the authors evaluated the relationship between the heterocellular luminal-A subtypes and clinical outcomes in tamoxifen-treated patient samples from the GSE6532 dataset. These results were then compared with recurrence-free survival (RFS) as stratified by risk of recurrence (ROR) and OncotypeDx.
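This record does not state which survival estimator was used. As an assumption for illustration, the sketch below uses the Kaplan-Meier estimator, a common way to summarise recurrence-free survival in a group of patients with censored follow-up. The function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate at each distinct event time.

    times:  follow-up duration per patient
    events: 1 if recurrence was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) tuples.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        recurrences = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            recurrences += data[i][1]
            leaving += 1
            i += 1
        if recurrences:
            survival *= 1 - recurrences / at_risk
            curve.append((t, survival))
        # Both recurrences and censored patients leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up (years) for one subtype; 0 marks a censored patient.
for time, prob in kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]):
    print(time, prob)
```

Curves computed this way for each heterocellular subtype would typically be compared with a log-rank test, which is outside the scope of this sketch.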
RNAseq data from adjacent normal breast samples were used to evaluate the relationship between heterocellular subtypes and fat, stroma and epithelium in adjacent normal breast tissue. Please see the related article and its supplementary information files for more details on the methodology.

Datasets:
Data file Poudel et. al_.xlsx provides persistent links, file formats and repository names of the publicly available datasets that were analysed during this study and in turn, used to generate the figures, tables and supplementary figures and tables in this article. Additional files/information derived from published articles, used to generate the figures and/or tables in this study, are given as doi links in the data descriptions below.

Supplementary table 1 (SupplementaryTable_1_final_20190614.xlsx) and supplementary table 2 (Supplementary Table_2_20190614.xlsx), together with their descriptions, are included in this metadata record.

Data supporting figure 1 show the association of breast cancer with heterocellular subtypes, including the proportions of the consensus molecular subtypes (CMS) of colorectal cancer in multiple breast cancer datasets (TCGA and GSE42568) and the proportions of different heterocellular subtypes in luminal-A breast cancer samples (from TCGA).

Data supporting figure 2 show heterocellular subtype-based heterogeneity in luminal-A breast cancers. These include gene expression data of the top highly variable and selected marker genes between stem-like and other subtypes within the luminal-A breast cancer subtype and subtypes other than luminal-A (non-luminal A) from TCGA breast cancer data. This figure is also supported by gene set enrichment analysis (GSEA) data showing gene sets enriched in stem-like and inflammatory heterocellular subtype samples compared to the other subtypes from TCGA breast cancer.

Data supporting figure 3 show the enrichment of immune checkpoint genes, immune cells, the expanded immune (18-gene) signature (https://doi.org/10.1172/JCI91190) and other phenotypes in luminal-A heterocellular subtypes. Gene set enrichment analysis (GSEA) was carried out to identify immune cell types enriched in inflammatory heterocellular subtype samples relative to the other subtypes, using the Rooney et al. gene sets (https://doi.org/10.1016/j.cell.2014.12.033).

Data supporting figure 4 show the association of heterocellular subtypes with other published luminal-A breast cancer subtype classifications: the Ciriello subgroups of the luminal-A subtype (https://doi.org/10.1007/s10549-013-2699-3) and the two Netanely et al. luminal-A breast cancer subtypes (https://doi.org/10.1186/s13058-016-0724-2).

Data supporting figure 5 show the survival differences in heterocellular subtypes from ER-positive tamoxifen-treated samples.

Data supporting figure 6 summarise epithelial-mesenchymal transition (EMT) and copy number alterations in the luminal-A heterocellular subtypes.

Data supporting supplementary figure 1: Data show the association between intrinsic breast cancer subtypes and normal breast tissue with heterocellular subtypes.

Data supporting supplementary figure 2: Data show a comparison of heterocellular subtypes and clusters of luminal-A breast cancers as defined by Aure et al. (https://doi.org/10.1186/s13058-017-0812-y).

Data supporting supplementary figure 3: Data normalisation and analysis for the GSE6532 dataset for estrogen receptor-positive and tamoxifen-treated samples, as well as survival differences in heterocellular subtypes from estrogen receptor-positive samples.

Data access: All the data analysed in this study were derived from publicly available datasets. The datasets analysed during this study, and in turn used to generate the figures and tables in this article, can be found in the NCBI Gene Expression Omnibus (GEO) and cBioPortal for Cancer Genomics repositories and in the UCSC Xena browser (https://xena.ucsc.edu/). Please see the Poudel et. al_.xlsx file for links to specific datasets.


Funding

PP was supported by Pancreatic Cancer UK Future Research Leaders Fund under the supervision of AS. The authors acknowledge NHS funding to the NIHR Biomedical Research Centre at The Royal Marsden and the ICR.

Research Data Support

Research data support provided by Springer Nature