The data sources for the manuscript titled "The emergence of Sox and POU transcription factors predates the origins of animal stem cells" are provided in individual zip files.
Each zip file contains the source data corresponding to the main figures and supplementary figures. Detailed descriptions of the files can be found in the readme files located within folders containing multiple files.
The resource table and supplementary tables are included.
Brief description of each zip file:
Fig 1 Alignment files, superimposed structural models, energy logo matrix, Spec-seq binding energies, uncropped EMSA gels. Details are described in the readme file.
Fig 2 iPSC count statistics used for plotting and generating the heatmap; 3 biological replicates (experiment_1 and experiment_2 with 2 technical replicates each, experiment_3 with 3 technical replicates).
Fig 3 Uncropped gel images, quantified bands and structural models (PDB files). Details are described in the readme file.
Fig 4 iPSC count statistics used for calculating reprogramming efficiency and generating the box plot.
Fig 5 Alignment files, uncropped EMSA gels, energy logo matrix, Spec-seq binding energies, SELEX sequence logos for cycle 3. Details are described in the readme file.
Fig S1_S9_S10_S11 Phylogenetic analysis of holozoan Sox and nonSox HMG domains. Protein sequences were sampled, an alignment was computed, trimmed and used as input to compute a maximum-likelihood phylogeny. Tree files with statistical support values are provided in the dataset. Constrained tree searches and AU-test were also performed. The phylogenies were used for ancestral sequence reconstruction. Further details are given in the provided README.txt file.
Fig S2 Multiple sequence alignment files and relative binding energies
Fig S3 iPSC count statistics for plotting
Fig S4 iPSC count statistics for plotting and the raw qPCR data for analysis
Fig S5 Sequencing results of 3 clonal iPSC lines reprogrammed by chimeric-Salhel-I, iPSC count statistics for plotting and the raw qPCR data for analysis
Fig S6 Replicate gel images of whole cell lysate EMSA
Fig S7 iPSC count statistics for plotting and generating the heatmap; 2 biological replicates, each with 2 technical replicates
Fig S8 Raw gel images of PCR genotyping and sequencing results from genotyping of clonal iPSC lines reprogrammed by Pchi-mutants
Fig S12 Multiple sequence alignment of POU sequences from unicellular and metazoan POU factors.
Fig S13 Holozoan homeodomain phylogeny. We used an HMMER3 search to extract all homeobox domain-containing proteins present in Homo sapiens, Drosophila melanogaster, Nematostella vectensis, Amphimedon queenslandica, Trichoplax adhaerens and Mnemiopsis leidyi, as well as all holozoan sequences described above. The resulting 756 sequences were aligned using MAFFT and trimmed using trimAl, and IQ-TREE was used to build the phylogeny. Another tree was built using the same procedure but with fewer sequences, including all potential POU members and Six as an outgroup, as well as new sequences from sponges, placozoans and ctenophores (Hormiphora californensis). This tree spanned both the homeobox and POU domains.
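
For readers who want to retrace the Fig S13 workflow (HMMER3 search, MAFFT alignment, trimAl trimming, IQ-TREE tree building), the following is a minimal sketch that only assembles the command lines in order; it does not run anything. All file names, the `iqtree2` executable name and every option shown (`--tblout`, `--auto`, `-automated1`, `-m MFP`, `-B 1000`) are assumptions chosen for illustration, since the parameters actually used are not stated here. The step that extracts hit sequences from the search table into a FASTA file is not shown and its output file is hypothetical.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    name: str                      # short label for the pipeline stage
    argv: List[str]                # command and arguments
    stdout: Optional[str] = None   # file to redirect stdout into, if any


def build_pipeline(hmm_profile: str, proteomes: str,
                   prefix: str = "homeobox", bootstrap: int = 1000) -> List[Step]:
    """Assemble the four stages named in the Fig S13 description.

    Every option below is an illustrative assumption, not the setting
    used to produce the deposited trees.
    """
    hits_tbl = f"{prefix}.hits.tbl"
    hits_fasta = f"{prefix}.hits.fasta"   # hypothetical: hits extracted from hits_tbl
    aligned = f"{prefix}.aln.fasta"
    trimmed = f"{prefix}.trimmed.fasta"
    return [
        # Search the proteomes with a homeobox profile HMM.
        Step("hmmsearch", ["hmmsearch", "--tblout", hits_tbl, hmm_profile, proteomes]),
        # MAFFT writes the alignment to stdout, so it is redirected to a file.
        Step("mafft", ["mafft", "--auto", hits_fasta], stdout=aligned),
        # Remove poorly aligned columns.
        Step("trimal", ["trimal", "-in", aligned, "-out", trimmed, "-automated1"]),
        # Maximum-likelihood tree with model selection and ultrafast bootstrap.
        Step("iqtree", ["iqtree2", "-s", trimmed, "-m", "MFP", "-B", str(bootstrap)]),
    ]


def render(step: Step) -> str:
    """Format a step as a single shell line."""
    line = shlex.join(step.argv)
    if step.stdout is not None:
        line += " > " + shlex.quote(step.stdout)
    return line


if __name__ == "__main__":
    for step in build_pipeline("Homeobox.hmm", "proteomes.fasta"):
        print(render(step))
```

Keeping each stage as a separate argument list makes it easy to substitute the settings documented in the readme files of the corresponding zip archives before running the commands.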