12859_2016_1278_MOESM1_ESM.pdf (33.26 kB)

Additional file 1: Figure S2. of FMAP: Functional Mapping and Analysis Pipeline for metagenomics and metatranscriptomics studies

journal contribution
posted on 10.10.2016 by Jiwoong Kim, Min Kim, Andrew Koh, Yang Xie, Xiaowei Zhan
Workflow to create the KEGG Filtered UniProt (KFU) Reference Cluster. First, the UniProt ID mapping data, which contained 80.4 million protein accessions, were downloaded. To build connections between the UniProt database and the KEGG orthology database, we used the KEGG LinkDB API ( http://www.genome.jp/linkdb/ ) to select a subset of the UniProt proteins, retaining only bacterial, archaeal or fungal sequences. Next, we built connections between UniProt sequences and UniRef90 sequences via the UniProt ID mapping data and retained only one-to-one correspondences. Finally, we obtained 1,995,269 sequences, termed KFU (KEGG filtered UniRef90), each of which has a known relationship between UniRef90 and KEGG orthology. Solid black lines with arrows indicate data processing steps. Solid black lines with diamond-shaped heads are direct one-to-one relationships. Dashed black lines with diamond-shaped heads are indirect one-to-one relationships. (PDF 33 kb)
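The filtering steps in the caption can be sketched on a toy scale as follows. This is a minimal illustration, not the FMAP implementation: the function name `build_kfu`, the in-memory inputs, and all accessions and KO identifiers are hypothetical, and the reading of "one-to-one" used here (a UniRef90 cluster is kept only when exactly one retained UniProt accession maps to it, and that accession maps to no other cluster) is an assumption.

```python
from collections import defaultdict

# Taxonomic groups retained in the caption's workflow.
ALLOWED_KINGDOMS = {"bacteria", "archaea", "fungi"}


def build_kfu(uniprot_to_ko, uniprot_to_kingdom, id_mapping):
    """Return {uniref90_id: (uniprot_accession, ko)} for one-to-one pairs.

    All three inputs are hypothetical in-memory stand-ins:
      uniprot_to_ko      -- accession -> KEGG orthology ID (e.g. from LinkDB)
      uniprot_to_kingdom -- accession -> taxonomic group label
      id_mapping         -- iterable of (accession, uniref90_id) pairs
    """
    # Step 1: keep accessions that have a KEGG link and an allowed kingdom.
    kept = {
        acc
        for acc in uniprot_to_ko
        if uniprot_to_kingdom.get(acc) in ALLOWED_KINGDOMS
    }

    # Step 2: record UniProt <-> UniRef90 links in both directions.
    acc_to_clusters = defaultdict(set)
    cluster_to_accs = defaultdict(set)
    for acc, cluster in id_mapping:
        if acc in kept:
            acc_to_clusters[acc].add(cluster)
            cluster_to_accs[cluster].add(acc)

    # Step 3: retain only one-to-one correspondences (assumed definition).
    kfu = {}
    for cluster, accs in cluster_to_accs.items():
        if len(accs) != 1:
            continue
        (acc,) = accs
        if len(acc_to_clusters[acc]) == 1:
            kfu[cluster] = (acc, uniprot_to_ko[acc])
    return kfu


# Made-up example data; none of these identifiers come from the source.
uniprot_to_ko = {"P00001": "K00001", "P00002": "K00002",
                 "P00003": "K00003", "P00004": "K00004"}
uniprot_to_kingdom = {"P00001": "bacteria", "P00002": "animals",
                      "P00003": "fungi", "P00004": "archaea",
                      "P00005": "bacteria"}
id_mapping = [("P00001", "UniRef90_P00001"),
              ("P00002", "UniRef90_P00002"),
              ("P00003", "UniRef90_X"),
              ("P00004", "UniRef90_X"),
              ("P00005", "UniRef90_P00005")]

kfu = build_kfu(uniprot_to_ko, uniprot_to_kingdom, id_mapping)
```

In this toy input only `UniRef90_P00001` survives: `P00002` is outside the retained kingdoms, `P00003` and `P00004` share one cluster, and `P00005` has no KEGG orthology link.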


National Institutes of Health