Additional file 2 of Carbohydrate metabolic systems present on genomic islands are lost and gained in Vibrio parahaemolyticus

2019-05-27T05:00:00Z (GMT) by Abish Regmi, Ethna Boyd
Figure S1. Variants of the metabolic islands containing a citrate fermentation gene cluster in V. parahaemolyticus. Gray shading indicates homologous regions between strains. Arrows represent ORFs; ORFs with the same color represent functionally similar proteins. Gray arrows, genes coding for hypothetical and other functional proteins. Black arrows represent transposases.

Figure S2. Variants of the metabolic island containing l-rhamnose utilization and OGA clusters. A. Comparative analysis of the l-rhamnose gene cluster. Gray shading indicates homologous regions between strains. Arrows represent ORFs; identically colored ORFs indicate similar function. Gray arrows, genes coding for hypothetical and other functional proteins. Black arrows indicate transposases. B. OGA metabolism pathway with the enzymes involved and the ORFs identified in the 135-kb metabolic island of FORC_022. The OGA catabolism cluster in RIMD2210633 is also shown. uGA2, unsaturated galacturonate dimer; GH, glycoside hydrolase.

Figure S3. Genomic analysis of the l-arabinose catabolic gene cluster. A. Comparative analysis of the region between VPA0441 and VPA0450. Gray shading, region of nucleotide homology. B. d-galactitol pathways with proteins and ORFs identified in CFSAN007457. C. l-arabinose gene cluster present in V. parahaemolyticus strain RIMD2210633. Arrows indicate ORFs.

Figure S4. Phylogenetic analysis of AraD among Vibrionaceae. AraD from V. parahaemolyticus was used as a seed to identify homologues within the Vibrionaceae. Most OTUs represent multiple strains. The evolutionary history was inferred using the Neighbor-Joining method [26]. The optimal tree, with a sum of branch lengths of 2.24462315, is shown. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches [27]. The evolutionary distances were computed using the Dayhoff matrix-based method and are in units of the number of amino acid substitutions per site [28]. The rate variation among sites was modeled with a gamma distribution (shape parameter = 5). All ambiguous positions were removed for each sequence pair (pairwise deletion option). There were a total of 507 positions in the final dataset. Evolutionary analyses were conducted in MEGA X [29].

Figure S5. l-arabinogalactan metabolism cluster within a 62-kb island. A. A 62-kb island containing the l-arabinogalactan catabolic cluster is shown. Arrows indicate ORFs. Gray arrows, genes coding for hypothetical and other functional proteins. Black arrow indicates a transposase. B. l-arabinogalactan utilization pathway identified in CH25 and ORFs for l-arabinose utilization in RIMD2210633 and FORC_008.

Figure S6. Predicted model of the emergence of the 135-kb metabolic island. The putative progenitor metabolic island was not identified in any strain in the genome databases. Shown are the most evolutionarily parsimonious steps required to explain how these variants arose. In all, we identified 10 variants; for simplicity, we show only five gene clusters from these islands, which also contain restriction-modification systems, among others.

(PPTX 7712 kb)