Authors: Leszek P. Pryszcz, Jaime Huerta-Cepas, Toni Gabaldón
DOI: 10.1093/NAR/GKQ953
Keywords:
Abstract: Reliable prediction of orthology is central to comparative genomics. Approaches based on phylogenetic analyses closely resemble the original definition of orthology and paralogy and are known to be highly accurate. However, the large computational cost associated with these analyses is a limiting factor that often prevents their use at genomic scales. Recently, several projects have addressed the reconstruction of large collections of high-quality phylogenetic trees from which orthology and paralogy relationships can be inferred. This provides us with the opportunity to infer the evolutionary relationships of genes from multiple, independent phylogenetic trees. Using such a strategy, we combine phylogenetic information derived from different databases to predict orthology and paralogy relationships for 4.1 million proteins in 829 fully sequenced genomes. We show that the number of independent sources from which a prediction is made, as well as the level of consistency across predictions, can be used as reliable confidence scores. A webserver has been developed to easily access these data (http://orthology.phylomedb.org), which provides users with a global repository of phylogeny-based orthology and paralogy predictions.
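
The two confidence measures mentioned in the abstract (number of independent sources and consistency across predictions) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `score_pair`, the input layout, the source labels, and the definition of consistency as the fraction of trees that call the pair orthologous are all assumptions made for illustration.

```python
def score_pair(calls):
    """Summarise tree-based calls for one gene pair.

    `calls` is a list of (source, is_ortholog) tuples, one per tree that
    contains both genes. Both the layout and the scoring are hypothetical.
    """
    if not calls:
        raise ValueError("no trees cover this gene pair")

    # Number of distinct databases that contributed at least one tree.
    sources = {source for source, _ in calls}

    # Trees in which the pair was called orthologous.
    supporting = sum(1 for _, is_ortholog in calls if is_ortholog)

    return {
        "n_sources": len(sources),
        "n_trees": len(calls),
        # Assumed definition: share of trees agreeing on orthology.
        "consistency": supporting / len(calls),
    }


# Hypothetical example: four trees from three made-up sources.
calls = [("db_a", True), ("db_a", True), ("db_b", True), ("db_c", False)]
result = score_pair(calls)
print(result)
```

In this toy input, three of the four trees support orthology and three distinct sources contribute, so a pair like this would score higher than one backed by a single tree from a single source.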