Additional file 1: of Regulatory network changes between cell lines and their tissues of origin

Similarity between cell lines and their tissues of origin based on gene expression. (A) Principal component analysis (PCA) was performed to evaluate possible batch effects in the gene expression data. Samples are labeled by the year in which the sample was analyzed by the GTEx project, and the plots show the sample separation for the first 7 PCs. (B) Number of genes expressed in each group (LCL, whole blood, fibroblast, skin). Genes were separated into biological classes using the definitions from GENCODE release 19 (GRCh37.p13). (C) PCA of paired samples from the two tissues and two cell lines (89 subjects in total, each with all four samples) based on the normalized expression of all genes. The primary axis separates samples by tissue; the secondary axis separates primary tissue from cell lines. (D) To assess whether the PCA results depended on the choice of the 89 subjects with samples in all four groups, we repeated the analysis 100 times using 89 randomly selected samples from each group. The left panel shows the projection onto the first 2 PCs for one random analysis, and the right panel shows the distribution of PC1 and PC2 across the 100 analyses. (PDF 267 kb)
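
The resampling check described in (D) can be illustrated with a minimal sketch. This is not the authors' code: the expression matrices are synthetic stand-ins, the helper names `pca_scores` and `resampled_pca` are hypothetical, the pool of 120 samples per group and the 50 genes are arbitrary, and PCA is computed here by SVD of the gene-centered matrix, which is one common choice among several.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """PCA of a samples x genes matrix via SVD of the gene-centered data."""
    Xc = X - X.mean(axis=0)                      # center each gene (column)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    var_ratio = S**2 / np.sum(S**2)              # fraction of variance per PC
    return scores, var_ratio[:n_components]

def resampled_pca(groups, n_per_group=89, n_repeats=100, seed=0):
    """Repeat PCA on random draws of n_per_group samples from every group."""
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(n_repeats):
        rows, labels = [], []
        for name, X in groups.items():
            # draw samples without replacement from this group's pool
            idx = rng.choice(X.shape[0], size=n_per_group, replace=False)
            rows.append(X[idx])
            labels += [name] * n_per_group
        scores, var_ratio = pca_scores(np.vstack(rows))
        results.append((np.array(labels), scores, var_ratio))
    return results

# Synthetic stand-in data (assumption): 120 samples x 50 genes per group,
# with a different mean shift per group so the groups are separable.
data_rng = np.random.default_rng(1)
shifts = {"LCL": 0.0, "whole blood": 1.0, "fibroblast": 2.0, "skin": 3.0}
groups = {name: data_rng.normal(loc=shift, size=(120, 50))
          for name, shift in shifts.items()}

results = resampled_pca(groups)
labels, scores, var_ratio = results[0]
print(scores.shape, var_ratio)
```

Each entry of `results` holds the group labels, the PC1/PC2 scores, and the variance fractions for one random draw, so the scores from all draws can be pooled to inspect how stable the separation between groups is across repeats.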