Springer Nature
13059_2021_2291_MOESM1_ESM.docx (42.2 MB)

Additional file 1 of MEDALT: single-cell copy number lineage tracing enabling gene discovery

journal contribution
posted on 2021-02-24, 04:47 authored by Fang Wang, Qihan Wang, Vakul Mohanty, Shaoheng Liang, Jinzhuang Dou, Jincheng Han, Darlan Conterno Minussi, Ruli Gao, Li Ding, Nicholas Navin, Ken Chen
Additional file 1:
Fig. S1. Methodology of the framework. a. Illustration of the minimal event distance (MED) calculation. b. Average lineage partitioning accuracy (LPA) on 100 simulation datasets without noise. c. Estimating the lineage-specific cumulative fold level (CFL). d. Estimating the significance of CFL in an individual sample. e. AUC for detecting non-random fitness-associated alterations (FAAs) using LSA, permutation of the SCCN matrix instead of tree reconstruction, the GISTIC test, and the one-sided Wilcoxon signed-rank test on 100 simulation datasets without noise. f. Identification of non-random fitness-associated CNAs in a cohort of samples. g. Identification of parallel evolution CNAs in an individual sample.
Fig. S2. Efficiency of MEDALT on 9 × 3 × 20 simulation datasets, with population sizes from 400 to 2000 and genome sizes from 100 to 1000.
Fig. S3. Simulation and evaluation of the CNA evolution model. a. Illustration of simulated genomic structural rearrangements in the evolution of a tumor. K is the number of CNAs occurring during the period ∆t; r is the number of adjacent regions affected by a CNA. TD: tandem duplication. TER: terminal deletion. DEL: interstitial deletion. BFB: breakage fusion bridge. b. Simulated and inferred copy number evolution distance between two genomes. MED is compared with the commonly used Hamming, Euclidean and Manhattan distance metrics. c. AUC for identifying FAAs based on different combinations of models. Wilcox denotes the one-sided Wilcoxon signed-rank test. d. Effect of noise on FAA detection.
Fig. S4. SCCN profile of TNBC patient KTN102. Each row represents a cell from pre-, mid-, or post-treatment.
Fig. S5. Average distance between the root node and cells from pre-, mid- or post-treatment based on MEDALT, maximum parsimony (MP), neighbor-joining (NJ) and maximum likelihood trees. FC refers to the fold change between the average distance to the root of the mid-/post-treatment cells and that of the pre-treatment cells.
Fig. S6. Stratified average CNA rates and fractions of DDR gene loss among lineages (distinguished by colors) in 6 primary TNBC samples.
Fig. S7. Gene set enrichment analysis (GSEA) for genes identified by LSA in patient t1. Colors correspond to branches.
Fig. S8. Significant genes identified through cohort LSA from the TNBC scDNA-seq data. a. Venn diagram of the genes identified by MEDALT, MP and GISTIC but not reported in OncoKB, COSMIC and IntOGen. b. Overall survival (OS) analysis of breast cancer patients in TCGA. c. Progression-free survival (PFS) analysis of breast cancer patients in TCGA. d. Overall survival analysis of breast cancer patients in METABRIC. e. Fraction of cancer genes overlapping with events that were significant in a single lineage (#Lineage = 1), in multiple lineages (#Lineage > 1), in the parallel evolution test (#Lineage > 1 & PLSA < 0.001) and in cohort LSA (inter-tumor recurrent).
Fig. S9. Results of multiple myeloma patient 60,359. a. Inferred MEDALT and heatmap based on Pearson’s correlation of the inferCNV profiles between cells ordered by lineages in MEDALT. b. Inferred trajectory from Monocle and heatmap of Pearson’s correlation of the inferCNV profiles between cells ordered by states defined by Monocle.
Table S1. The algorithm for minimal event distance (MED) inference.
Table S2. The algorithm for rooted directed minimal spanning tree reconstruction.
Table S3. Information on the TNBC data.
Table S4. Annotation of the broad CNAs identified in TNBCs based on literature.
Table S5. Information on the scRNA-seq data.

Funding

National Institutes of Health
CPRIT
University of Texas MD Anderson Cancer Center
National Cancer Institute
Chan Zuckerberg Initiative DAF

    Genome Biology
