MOESM2 of CGGBP1 regulates CTCF occupancy at repeats

Additional file 2:

Figure S1. (A) Human juvenile fibroblasts co-immunostained for CGGBP1 (green) and CTCF (red). Nuclei were counterstained with DAPI (blue). Mean fluorescence intensities for CGGBP1 (green) and CTCF (red) were normalized along the line-marked segment and plotted using ImageJ. Normalized signals along the line segment drawn through a midbody show colocalization of CGGBP1 and CTCF. (B) Human juvenile fibroblasts co-immunostained for CGGBP1 (red) and CTCF (green). Nuclei were counterstained with DAPI (blue). All images were captured with a confocal plane of 1.601 µm.

Figure S2. PLA (red foci) confirms CTCF-CGGBP1 interaction in situ. Nuclei were stained with DAPI (blue). CGGBP1-CTCF interaction was stronger in the nuclei than in the cytoplasm (inset of mouse anti-CGGBP1:rabbit anti-CTCF sample). No significant interaction was observed in the IgG and no-primary antibody negative controls (inset of no-primary antibody sample). All images were captured with a confocal plane of 1.601 µm.

Figure S3. Cytoplasmic and nuclear fractions were separated from HEK293T cells by the REAP protocol. The upper panel shows immunoblot results for the cytoplasmic marker GAPDH. The middle and lower panels show immunoblot results for the nuclear protein histone H3 using two different antibodies (H3K4me3 and H3K27me3). Equal volumes of cytoplasmic and nuclear fraction lysates were run in the lanes.

Figure S4. The closest distance between each starved RM CGGBP1 peak midpoint and the nearest transcription factor peak midpoint was determined using bedtools closest. The frequency distribution of closest distances was plotted in bins of 0.5 kb for starved RM CGGBP1 peaks (A) and stimulated RM CGGBP1 peaks (B).

Figure S5. HEK293T cells were transduced with control shRNA lentivirus, CGGBP1-shRNA lentivirus or CGGBP1-overexpression lentivirus. The upper panel shows immunoblot results for CGGBP1 and the lower panel shows the same for the GAPDH loading control.

Figure S6. The distribution of CTCF reads for repeat-masked peaks was plotted for CT, KD and OE samples. CT reads at repeat-masked CTCF CT peaks were plotted for 1 kb flanks with a bin size of 10 (A). The distribution of published CTCF reads in prostate epithelial cells (ENCFF098DGZ) (B), A549 (ENCFF9810JS) (C) and HEK293 (ENCFF183AAP) (D) was also plotted at CTCF CT peaks. (E to H) The distribution of CTCF KD reads at repeat-masked CTCF KD peaks was plotted for 1 kb flanks with a bin size of 10 (E). The distribution of published CTCF reads in prostate epithelial cells (ENCFF098DGZ) (F), A549 (ENCFF9810JS) (G) and HEK293 (ENCFF183AAP) (H) was also plotted at CTCF KD peaks. (I to L) The distribution of CTCF OE reads at repeat-masked CTCF OE peaks was plotted for 1 kb flanks with a bin size of 10 (I). The distribution of published CTCF reads in prostate epithelial cells (ENCFF098DGZ) (J), A549 (ENCFF9810JS) (K) and HEK293 (ENCFF183AAP) (L) was also plotted at CTCF OE peaks.

Figure S7. From RM CGGBP1 peaks with a tag count of more than 10, peaks with summits in the central one-third of the peak length were selected for analysis. Sequences of the summit regions of the selected peaks (start = peak start + 0.4 × peak length; end = peak start + 0.667 × peak length) were fetched from repeat-masked hg38 and subjected to de novo motif search using DREME (minK 8). Despite repeat-masking, these RM CGGBP1 peaks were centrally enriched in motifs whose sequences correspond to subsequences of Alu-SINEs (A) and L1-LINEs (B). These repeat-derived motifs were observed in starved (locations in blue) as well as stimulated (locations in red) peaks. Some motifs occur more than once on the transposon consensus sequences.

Figure S8. The distribution of read alignment scores was plotted for mapping on the masked and unmasked hg38 genome. The end-to-end alignment score is shown on the X-axis and the percentage of aligned reads on the Y-axis.

Figure S9. The distribution of CTCF reads (CT, KD and OE) on UCSC LINEs was plotted. LINE coordinates were scaled to 0.3 kb and the signal was plotted for 1 kb flanks using the kmeans clustering option.

Figure S10. Repeat content analysis of reads prior to mapping, reads post-mapping and CTCF peaks. No significant difference between CT and KD was observed in the presence of LINEs or SINEs in reads analyzed prior to or post-mapping. However, a significant difference in LINE content between CT and KD was observed on the peaks.

Figure S11. The distribution of CTCF reads for repeat-unmasked peaks was plotted for CT, KD and OE samples. CT reads at repeat-unmasked CTCF CT peaks were plotted for 1 kb flanks with a bin size of 10 (A). The distribution of published CTCF reads in prostate epithelial cells (ENCFF098DGZ) (B), A549 (ENCFF9810JS) (C) and HEK293 (ENCFF183AAP) (D) was also plotted at CTCF CT peaks. (E to H) The distribution of CTCF KD reads at unmasked CTCF KD peaks was plotted for 1 kb flanks with a bin size of 10 (E). The distribution of published CTCF reads in prostate epithelial cells (ENCFF098DGZ) (F), A549 (ENCFF9810JS) (G) and HEK293 (ENCFF183AAP) (H) was also plotted at CTCF KD peaks. (I to L) The distribution of CTCF OE reads at unmasked CTCF OE peaks was plotted for 1 kb flanks with a bin size of 10 (I). The distribution of published CTCF reads in prostate epithelial cells (ENCFF098DGZ) (J), A549 (ENCFF9810JS) (K) and HEK293 (ENCFF183AAP) (L) was also plotted at CTCF OE peaks.

Figure S12. (A) The distribution of repeat-masked (top) and repeat-unmasked (bottom) CTCF reads in 1 million randomly picked 0.5 kb genomic regions for CT, KD and OE samples. CT shows a bipolar distribution pattern, with a preponderance of read-free and highly read-rich regions and a paucity of regions with moderate read density; this pattern is stronger when repeats are unmasked (bottom) than when they are masked (top). In contrast, including the repeats shifts the KD and OE read distribution patterns toward the center, with a majority of regions having moderate read density. (B) Principal Component Analysis (PCA) of the differences between CT, KD, OE and input (upper panel) shows that all three ChIP samples differ from input in different ways. The specificity of CTCF ChIP causes a difference from input that lies mainly along PC1 for CT, mainly along PC2 for OE and along a mix of PC1 and PC2 for KD.

Figure S13. Pie charts show the segregation of CTCF peaks according to the presence or absence of a CTCF motif or LINEs in the peaks: top, CT (blue); middle, KD (red); bottom, OE (green). The panel on the right shows the CTCF read distribution, with standard deviation, plotted in the immediate flanks of the Motif-negative LINE-positive CTCF peaks (top) and Motif-positive LINE-negative CTCF peaks (bottom) for CT (blue), KD (red) and OE (green).

Figure S14. The distribution of CTCF reads at replication origins was plotted for 5 kb flanks with a bin size of 10 (A). The distribution of CTCF reads at enhancers (UCSC Regulation datasets) was plotted for 5 kb flanks with a bin size of 10 (B).

Figure S15. The distribution of histone modification reads for CT and KD samples was plotted 1 Mb upstream and downstream of LAD boundaries. Histone modification read counts in 1 kb bins were plotted for CT (blue) and KD (red) peaks at LAD start sites (A) and LAD end sites (B).

Figure S16. The distribution of CTCF and histone modification reads at UCSC CTCF-binding sites was plotted for 5 kb flanks with a bin size of 10 (A). The variation of CTCF and histone reads, with a bin size of 10, at UCSC CTCF-binding sites is shown as the difference from the mean for CT and KD samples (B).

Figure S17. Differences in histone modification read occupancy between the upstream and downstream 10 kb flanks of all exclusive peaks were compared between CT and KD. Read coverage counts for the 10 kb upstream and downstream flanks were converted to a logarithmic scale (base 2). The log2 fold change (M = upstream − downstream) was plotted against the average read count for CT (A) and KD (B).

Figure S18. ΔM values for CT–KD exclusive peaks were calculated for the H3K4me3 reads. Peaks with a significantly different H3K4me3 profile in CT and KD (ΔM < −2 or > +2) are shown in blue for CT (M-CT diff) and red for KD (M-KD diff), whereas peaks whose H3K4me3 profile did not alter significantly in KD (ΔM from −2 to +2) are shown in green for CT (M-CT indiff) and yellow for KD (M-KD indiff) (A). Similarly, ΔM values were calculated for KD–CT exclusive peaks; peaks that showed substantial changes in H3K4me3 profile in KD (ΔM < −2 or > +2) are shown in red for KD (M-KD diff) and blue for CT (M-CT diff). Peaks whose H3K4me3 profile did not change substantially upon CGGBP1 depletion are shown in yellow for KD (M-KD indiff) and green for CT (M-CT indiff) (B). (C and D) Similarly, ΔM values for CT–KD exclusive peaks were calculated for the H3K27me3 reads. Peaks with a significantly different H3K27me3 profile in CT and KD (ΔM < −1 or > +1) are shown in blue for CT (M-CT diff) and red for KD (M-KD diff), whereas peaks whose H3K27me3 profile did not alter significantly in KD (ΔM from −1 to +1) are shown in green for CT (M-CT indiff) and yellow for KD (M-KD indiff) (C). Similarly, ΔM values were calculated for KD–CT exclusive peaks; peaks that showed substantial changes in H3K27me3 profile in KD (ΔM < −1 or > +1) are shown in red for KD (M-KD diff) and blue for CT (M-CT diff). Peaks whose H3K27me3 profile did not change substantially upon CGGBP1 depletion are shown in yellow for KD (M-KD indiff) and green for CT (M-CT indiff) (D).

Figure S19.
The closest distance between permissive TSSs and CGGBP1-dependent CTCF-binding sites was obtained using bedtools closest. The distribution of distances between CGGBP1-dependent CTCF-binding sites and permissive TSSs was plotted.
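
The legends for Figures S4 and S19 both describe a nearest-feature distance followed by binning. The Python below is a minimal sketch of that idea only; it is not the bedtools implementation and not the authors' script. The function names, the restriction to midpoints on a single chromosome and the example coordinates are assumptions made for illustration.

```python
from bisect import bisect_left
from collections import Counter


def nearest_distance(query_mid, sorted_targets):
    """Absolute distance (bp) from query_mid to the nearest value in sorted_targets."""
    i = bisect_left(sorted_targets, query_mid)
    # Only the neighbours just left and right of the insertion point can be nearest.
    candidates = sorted_targets[max(i - 1, 0):i + 1]
    return min(abs(query_mid - t) for t in candidates)


def binned_distances(query_mids, target_mids, bin_bp=500):
    """Count nearest distances per bin; bin index = distance // bin_bp (500 bp = 0.5 kb)."""
    targets = sorted(target_mids)
    return Counter(nearest_distance(q, targets) // bin_bp for q in query_mids)


# Hypothetical midpoints on one chromosome (illustrative values, not study data).
cggbp1_mids = [1_000, 5_200, 9_900]
tf_mids = [1_100, 4_000, 12_000]
hist = binned_distances(cggbp1_mids, tf_mids)
```

With these toy coordinates the nearest distances are 100, 1200 and 2100 bp, so they fall into the first, third and fifth 0.5 kb bins respectively. A real analysis would keep chromosomes separate, as bedtools does for interval files.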

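The M and ΔM values used in the legends for Figures S17 and S18 can be illustrated with a short Python sketch. It assumes that M is the difference of log2-transformed upstream and downstream read counts, as the Figure S17 legend states, and that ΔM is the CT value of M minus the KD value, which the legends do not spell out. The pseudocount, the use of the mean of the log2 counts as the average, the function names and the example counts are further assumptions for illustration.

```python
import math


def m_and_a(upstream, downstream, pseudocount=1.0):
    """Return (M, A) for one peak from its upstream and downstream flank read counts.

    M = log2(upstream) - log2(downstream); A = mean of the two log2 counts.
    The pseudocount (an assumption) avoids log2(0) for empty flanks.
    """
    up = math.log2(upstream + pseudocount)
    down = math.log2(downstream + pseudocount)
    return up - down, (up + down) / 2


def classify(delta_m, threshold):
    """Label a peak 'diff' if |delta M| exceeds the threshold, else 'indiff'."""
    return "diff" if abs(delta_m) > threshold else "indiff"


# Hypothetical flank counts for one peak (not study data).
m_ct, a_ct = m_and_a(63, 7)    # CT sample
m_kd, a_kd = m_and_a(15, 15)   # KD sample
delta_m = m_ct - m_kd          # assumed definition of delta M

h3k4me3_label = classify(delta_m, threshold=2)   # legend cutoff: below -2 or above +2
h3k27me3_label = classify(0.5, threshold=1)      # legend cutoff: below -1 or above +1
```

The two thresholds mirror the cutoffs quoted in the Figure S18 legend: 2 for H3K4me3 and 1 for H3K27me3.
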
Keyword(s)

CGGBP1; CTCF; CTCF ChIP; CGGBP1-CTCF interaction; CGGBP1 depletion; CGGBP1-dependent CTCF-binding sites; PLA; DAPI; confocal plane; 1.601 µm; fibroblasts co-immunostained; no-primary antibody sample; immunoblot results; cytoplasmic marker GAPDH; GAPDH loading control; protein Histone H3; H3K; HEK293; HEK293T cells; control shRNA lentivirus; CGGBP1-shRNA lentivirus; CGGBP1-overexpression lentivirus; OE samples; CTCF KD; CTCF OE; CTCF CT peaks; CTCF KD peaks; CTCF OE peaks; repeat-masked CTCF CT peaks; repeat-masked CTCF KD peaks; repeat-masked CTCF OE peaks; repeat-unmasked CTCF CT peaks; repeat-masked hg38; RM CGGBP1 peaks; RM CGGBP1 peak midpoint; transcription factor peak midpoint; DREME; SINE; 1-LINE; transposon consensus sequences; Repeat content analysis; End-to-end alignment score; Reads coverage count; bin size; 0.5 kb; 1 kb flanks; 5 kb flanks; 10 kb flanks; bipolar distribution pattern; Principal Component Analysis; PCA; PC1; PC2; CTCF peaks segregation; Motif-positive LINE-negative CTCF peaks; Motif-negative LINE-positive CTCF peaks; UCSC CTCF-binding sites; UCSC Regulation datasets; histone modification; LAD end sites; ΔM value; M-CT diff; M-CT indiff; M-KD diff; M-KD indiff; TSS; prostate epithelial cells; ENCFF098DGZ; ENCFF9810JS; ENCFF183AAP; MOESM

License

CC BY + CC0