MOESM1 of Integrated epigenomic analysis stratifies chromatin remodellers into distinct functional groups

Additional file 1:

Figure S1. A) Gene expression of each remodeller protein in LNCaP cells from RNA-seq (mean and SE; n = 3). B) Gene expression of remodeller proteins in the TCGA data set of 486 prostate epithelial tumours (mean and SE; n = 486). C) Gene expression of each remodeller in PrEC cells from RNA-seq (mean and SE; n = 3). D) Gene expression of remodellers in the TCGA data set of normal prostate epithelial tissue (mean and SE; n = 52). E) Scatter plot of logRPKM values for TCGA data versus logTPM values for LNCaP and PrEC RNA-seq data. The comparison of PrEC with normal TCGA tissue is shown in green, and the comparison of LNCaP with TCGA tumours is shown in red. Lines are linear regression lines of best fit. Pearson’s correlation coefficients between the two normal data sets (cor = 0.670013) and the two cancer data sets (cor = 0.8375552; SNF2L is excluded as an outlier) are shown under the plot.

Figure S2. A–H) Histograms of chromatin remodeller binding sites, in 150 bp bins. The vertical line marks 750 bp (~ 5 nucleosomes).

Figure S3. A) Heatmap of the ChromHMM emission profile based on the learned model from the Epigenome Roadmap. B) Heatmap of chromatin state enrichment generated from the ChromHMM analysis, where the first column gives the percentage of the genome in each state and the remaining columns give the enrichment of each chromatin state over annotated genomic features. C) Heatmap of H3K27ac, H3K4me1, H3K4me3 and p300 ChIP-seq signal and DNaseI signal at putative active enhancers, ± 2.5 kb from the centre of each enhancer, sorted by H3K27ac and H3K4me1 signal.

Figure S4. A) Heatmap of H3K4me3, RNA Pol II and chromatin remodeller ChIP-seq signal and DNaseI signal at RefSeq-annotated promoters. Signal is plotted ± 2 kb from the transcription start site (TSS) and sorted by H3K4me3 signal. B, D) Venn diagrams of chromatin remodeller binding site overlaps for Group 1 and Group 2 remodellers, respectively. C) Percentage of all Group 1 remodeller binding sites that are unique to each Group 1 remodeller or that contain multiple Group 1 remodellers. E) Percentage of all Group 2 remodeller binding sites that are unique to each Group 2 remodeller or that contain multiple Group 2 remodellers.

Figure S5. A–O) Enrichment of chromatin remodeller binding sites across the 15-state ChromHMM model based on the Epigenome Roadmap (see Methods). Significant enrichment is defined as a score above one, and significant depletion as a score below one, with Benjamini–Hochberg adjusted p-value ***p < 0.001 or **p < 0.05.

Figure S6. A–D) Genome-wide average distribution of ChIP-seq signals for histone modifications H3K4me3, H3K4me1, H3K27me3 and H3K9me3, ± 2 kb from the centre of chromatin remodeller binding sites. E–H) Pearson’s correlation score matrices of chromatin remodeller ChIP-seq signal at active, bivalent, facultative and constitutive promoters. Each matrix was ordered by hierarchical clustering.

Figure S7. A–B) Violin plots of nucleotide frequency within remodeller binding sites for Group 1 and Group 2. C–D) Dinucleotide frequency within chromatin remodeller Group 1 and Group 2 binding sites. E) DNA methylation density within chromatin remodeller binding sites for each remodeller. F) CpG density within chromatin remodeller binding sites for each remodeller protein. G) CpG density of unmethylated CpG islands, methylated CpG islands and the whole genome. A significant difference in CpG density was detected between each of the groups (one-way ANOVA, ***p < 0.001).

Figure S8. A) The genome divided into TADs (85.4%), TAD boundaries (2.6%) and unorganised chromatin (12.0%), using TADs and boundaries called at 40 kb resolution from Hi-C data. B–C) GAT enrichment of chromatin remodellers at TADs and TAD boundaries. Significant enrichment is defined as a score above one, and significant depletion as a score below one, with Benjamini–Hochberg adjusted p-value ***p < 0.001 or **p < 0.05.
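
The significance labels used in Figures S5 and S8 can be illustrated with a minimal sketch. It assumes a list of raw p-values and matching enrichment scores is already available. The function names `bh_adjust` and `significance_label` are hypothetical, and the code shows only the generic Benjamini–Hochberg step-up adjustment and the thresholds stated in the legends, not the authors' actual pipeline or the output of GAT.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significance_label(score, adj_p):
    """Label a score using the legend's thresholds (hypothetical helper)."""
    if adj_p < 0.001:
        stars = "***"
    elif adj_p < 0.05:
        stars = "**"
    else:
        return "not significant"
    if score == 1:
        return "not significant"
    # A score above one is enrichment, below one is depletion.
    direction = "enriched" if score > 1 else "depleted"
    return f"{direction} {stars}"


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    scores = [2.5, 0.4, 1.8, 3.1]
    raw_p = [0.01, 0.04, 0.03, 0.005]
    for score, q in zip(scores, bh_adjust(raw_p)):
        print(score, round(q, 4), significance_label(score, q))
```

In practice the adjustment would normally come from an established statistics package. The pure-Python version is used here only to keep the sketch self-contained.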