UniProt-GOA: A Central Resource for Data Integration and GO Annotation.

Authors: Tony Sawford, Claire O'Donovan, Maria Jesus Martin, Mélanie Courtot, Elena Speretta

DOI:

Keywords: Context (language use), InterPro, Ensembl, Annotation, UniProt, Data integration, Computational biology, Computer science, Controlled vocabulary, Ontology (information science)

Abstract: The Gene Ontology (GO) is a well-established, structured vocabulary used in the functional annotation of gene products. GO terms are used to replace the multiple nomenclatures used by scientific databases that can hamper data integration. Currently, GO consists of more than 41,000 terms describing the molecular function, biological process and subcellular location of a gene product in a generic cell. The UniProt-Gene Ontology Annotation (UniProt-GOA) project [1] provides high-quality manual and electronic GO annotations, historically to proteins within the UniProt Knowledgebase. Recently, support for RNAs via RNAcentral IDs and macromolecular complexes, identified by IntAct Complex Portal IDs, was added. For many species, no experimental data is available: electronic annotations are the only source of information for biological investigation, and it is therefore critical that solid pipelines for data integration across multiple resources be implemented. For example, we rely on Ensembl [2] to annotate automatically according to orthology between species, or InterPro [3] to identify proteins with similar signatures with which a conserved function is associated. In September 2015, an additional 1.5 million annotations from the Unified Rule (UniRule [4]) system were added electronically. In addition to increasing the number of annotations available, UniProt-GOA also supports, as part of manual curation, the addition of information about the context of a GO term, such as the target gene or the location of a molecular function, via annotation extensions. We can now describe that a gene product is located in a specific compartment of a specific cell type (e.g., a gene product that localizes to the nucleus of a keratinocyte [5]). Annotation extensions are amenable to sophisticated queries and reasoning. A typical use case is researchers studying a protein that is causative of a rare cardiac phenotype: they will be more interested in cardiomyocyte differentiation than in all gene products involved in cell differentiation. Annotation files for various reference proteomes are released monthly, including human, mouse, rat, zebrafish, cow, chicken, dog, pig, Arabidopsis and Dictyostelium, as well as a file for all species in UniProtKB. The UniProt-GOA dataset can be queried through our user-friendly QuickGO browser [6] or downloaded in a parsable format via the EMBL-EBI [7] and GO Consortium FTP [8] sites. UniProt-GOA is the largest and most comprehensive open-source contributor to the GO Consortium annotation effort. GO has increasingly been integrated into tools that aid the analysis of large datasets resulting from high-throughput experiments, thus assisting researchers in the interpretation of their results.
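
The abstract describes annotation files in a parsable format and annotation extensions that can be queried. The Python sketch below shows how such a query might look. It assumes the files follow the tab-separated GAF 2.x layout, with the GO ID in column 5 and the annotation extension in column 16; the abstract does not name the format. The accessions, the reference, the relation name and the sample rows are hypothetical and serve only to illustrate the keratinocyte nucleus example.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple, Tuple

# One extension group is a list of (relation, target) pairs that hold together.
ExtensionGroup = List[Tuple[str, str]]


class Annotation(NamedTuple):
    object_id: str   # column 2: database object ID (e.g. a UniProtKB accession)
    go_id: str       # column 5: GO term ID
    evidence: str    # column 7: evidence code
    aspect: str      # column 9: F, P or C
    extensions: List[ExtensionGroup]  # column 16, parsed


# Matches a single "relation(TARGET)" expression.
EXT_RE = re.compile(r"^\s*([^()\s]+)\(([^()]+)\)\s*$")


def parse_extensions(field: str) -> List[ExtensionGroup]:
    """Split on '|' (alternative groups), then ',' (combined within a group)."""
    groups: List[ExtensionGroup] = []
    for alternative in field.split("|"):
        group: ExtensionGroup = []
        for part in alternative.split(","):
            match = EXT_RE.match(part)
            if match:
                group.append((match.group(1), match.group(2)))
        if group:
            groups.append(group)
    return groups


def parse_gaf(lines: Iterable[str]) -> Iterator[Annotation]:
    """Yield annotations from GAF-style lines, skipping '!' comment lines."""
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 15:
            continue  # not a complete annotation row
        ext_field = cols[15] if len(cols) > 15 else ""
        yield Annotation(cols[1], cols[4], cols[6], cols[8],
                         parse_extensions(ext_field))


def located_in_cell_type(annotations: Iterable[Annotation],
                         go_id: str, cell_type_id: str) -> List[Annotation]:
    """Keep annotations to go_id whose extension says part_of(cell_type_id)."""
    return [
        a for a in annotations
        if a.go_id == go_id and any(
            (relation, target) == ("part_of", cell_type_id)
            for group in a.extensions for relation, target in group
        )
    ]


def _row(accession: str, extension: str) -> str:
    # Hypothetical sample row with the 17 GAF 2.x columns.
    return "\t".join(["UniProtKB", accession, "GENE", "", "GO:0005634",
                      "PMID:0000000", "IDA", "", "C", "", "", "protein",
                      "taxon:9606", "20150901", "UniProt", extension, ""])


SAMPLE = ["!gaf-version: 2.1",
          _row("P00000", "part_of(CL:0000312)"),  # nucleus, in a keratinocyte
          _row("P00001", "")]                     # nucleus, no extension

if __name__ == "__main__":
    for hit in located_in_cell_type(parse_gaf(SAMPLE),
                                    "GO:0005634", "CL:0000312"):
        print(hit.object_id, hit.go_id, hit.extensions)
```

Filtering on the extension rather than on the GO term alone is what narrows a generic term such as "nucleus" to one cell type; the same pattern would apply to the cardiomyocyte use case, given the relevant term identifiers.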

References (1)
Rachael P Huntley, Midori A Harris, Yasmin Alam-Faruque, Judith A Blake, Seth Carbon, Heiko Dietze, Emily C Dimmer, Rebecca E Foulger, David P Hill, Varsha K Khodiyar, Antonia Lock, Jane Lomax, Ruth C Lovering, Prudence Mutowo-Meullenet, Tony Sawford, Kimberly Van Auken, Valerie Wood, Christopher J Mungall. A method for increasing expressivity of Gene Ontology annotations using a compositional approach. BMC Bioinformatics, vol. 15, p. 155 (2014). doi:10.1186/1471-2105-15-155