posted on 2015-11-18, 05:00 authored by Christelle Robert, Ronan Kapetanovic, Dario Beraldi, Mick Watson, Alan Archibald, David Hume
FANTOM5 promoter comparative analysis framework. FANTOM5 promoters were extended to span from 400 bases upstream to 100 bases downstream of their main TSS. Orthologous genomic regions were extracted for all genes within the target genome that have orthologs between the species the FANTOM5 promoters belong to (human/mouse) and the species of the target genome (pig/human/mouse). These regions were extracted as 2.1 kb windows containing 2 kb upstream and 100 bp downstream of each orthologous gene's 5' end. Promoters mapping to at least one known orthologous region were reported with their mapping locations, while the remaining promoters were mapped to the whole target genome (repeat-masked). The set of uniquely mapped promoters is reported with their genomic locations. The multimapped promoters were filtered based on the score ratio of the two best hits: the top hit was considered to be uniquely mapped if the score ratio (s2/s1) between the second hit (s2) and the first (s1) was below 0.95. When the score ratio criterion was not met, one of the two top hits was considered a single hit whenever it was located on a chromosome and all other hits were located on unplaced scaffolds. The unmapped promoters were re-aligned (2nd run; see Methods) and the same procedure was followed to report the uniquely mapped promoters. Two sets of genes were extracted from the final set of FANTOM5 human promoters that remained unmapped to the pig genome, for GO term enrichment analysis (see text): the set of genes with at least one FANTOM5 promoter unmapped (referred to as genes_tss) and the set of genes with all associated FANTOM5 promoters unmapped (referred to as genes_none). (PDF 259 kb)
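The multimapping filter described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the `Hit` record, the `pick_unique_hit` function and the `on_chromosome` flag are hypothetical names introduced here, and how a sequence is classified as a chromosome or an unplaced scaffold is assumed to be decided upstream. Only the 0.95 threshold and the chromosome-versus-scaffold fallback come from the caption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    """One alignment of a promoter to the target genome (hypothetical record)."""
    seqname: str         # chromosome or scaffold name
    score: float         # alignment score; higher is better
    on_chromosome: bool  # False for unplaced scaffolds (assumed to be set upstream)


def pick_unique_hit(hits: List[Hit], max_ratio: float = 0.95) -> Optional[Hit]:
    """Return the hit treated as the unique mapping, or None if ambiguous."""
    if not hits:
        return None
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    if len(ranked) == 1:
        return ranked[0]

    # Score ratio test: s2/s1 below the threshold means the top hit wins.
    s1, s2 = ranked[0].score, ranked[1].score
    if s1 > 0 and s2 / s1 < max_ratio:
        return ranked[0]

    # Fallback: accept one of the two top hits if it is the only hit on a
    # chromosome, i.e. every other hit lies on an unplaced scaffold.
    placed = [h for h in ranked if h.on_chromosome]
    if len(placed) == 1 and any(placed[0] is h for h in ranked[:2]):
        return placed[0]

    return None  # still multimapped
```

For example, two hits scoring 100 and 90 give a ratio of 0.90, so the first is kept; two hits scoring 100 and 99 on different chromosomes remain ambiguous, and the function returns `None`.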
Funding
Biotechnology and Biological Sciences Research Council