Additional file 1: Table S1. of Evidence-based green algal genomics reveals marine diversity and ancestral characteristics of land plants

Table S1. Average GC%, size, protein-encoding gene count and density per M. pusilla CCMP1545 chromosome. Finished chromosomes are indicated.
Table S2. Ortholog similarity in prasinophytes. The gene fraction is the ortholog count divided by the number of proteins in the smaller set.
Table S3. OrthoGroup numbers across the Class II prasinophytes (see also Fig. 4) and number of OrthoGroups with >1 protein from the respective taxon in this analysis. Note that the higher redundancy of the transcriptome-based proteomes of D. tenuilepis and CCMP2099 is likely due to artefacts of the method used.
Table S4. Predicted proteins present in Micromonas and Dolichomastix, but not in Bathycoccus or Ostreococcus.
Table S5. Genes encoding proteins in the peptidoglycan biosynthesis pathway identified here in prasinophytes, streptophytes and glaucophytes. Gene IDs are from the new annotation generated here as shown in the JGI system (CCMP1545, RCC299, Ostreococcus RCC809), CAMPEP (CCMP2099, D. tenuilepis, N. pyriformis, P. salinarum, G. wittrockiana; see www.iplantcollaborative.org), GenBank (O. tauri, O. lucimarinus, and B. prasinos) and Phytozome version 11 (all others). tblastn: gene found using tblastn against the genome (streptophytes) or transcriptome (CCMP2099, D. tenuilepis); no corresponding gene model exists.
Table S6. EST support for Viridiplantae peptidoglycan genes (tblastn with predicted peptide, E-value cutoff 10^-15). A maximum of five ESTs are listed for each.
Table S7. Signal peptides on Viridiplantae peptidoglycan pathway enzymes found using TargetP. C: chloroplast; M: mitochondrion; S: secreted; −: no signal; n/a: protein was located by tblastn in the genome and no corresponding gene model exists.
Table S8. Greencut2 family proteins in prasinophytes.
(XLSX 167 kb)