Measuring Relatedness Between Scientific Entities in Annotation Datasets

Authors: Guillermo Palma, Maria-Esther Vidal, Eric Haag, Louiqa Raschid, Andreas Thor

Keywords: Linked data, ENCODE, Graph (abstract data type), Computer science, Exploit, Semantic similarity, Controlled vocabulary, Ontology (information science), Information retrieval, Annotation

Abstract: Linked Open Data has made available a diversity of scientific collections where scientists have annotated entities in the datasets with controlled vocabulary terms (CV terms) from ontologies. These semantic annotations encode knowledge which is captured in annotation datasets. One can mine these datasets to discover relationships and patterns between entities. Determining the relatedness (or similarity) between entities becomes a building block for graph pattern mining, e.g., identifying drug-drug relationships could depend on the similarity of the diseases (conditions) that are associated with each drug. Diverse similarity metrics have been proposed in the literature, e.g., i) string-similarity metrics; ii) path-similarity metrics; iii) topological-similarity metrics; all measure relatedness in a given taxonomy or ontology. In this paper, we consider a novel annotation similarity metric, AnnSim, that measures the relatedness between two entities in terms of the similarity of their annotations. We model AnnSim as a 1-to-1 maximal weighted bipartite match, and we exploit properties of existing solvers to provide an efficient solution. We empirically study the effectiveness of AnnSim on real-world datasets of genes and their GO annotations, clinical trials, and a human disease benchmark. Our results suggest that AnnSim can provide a deeper understanding of the relatedness of concepts and can provide an explanation of potential novel patterns.
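To make the 1-to-1 maximal weighted bipartite match concrete, the following is a minimal sketch, not the authors' implementation. Several pieces are assumptions introduced only for illustration: the toy taxonomy `PARENT`, the term similarity `toy_sim`, the brute-force search that stands in for the "existing solvers" the abstract mentions, and the normalization `2 * total / (|A| + |B|)`, which the abstract does not specify.

```python
from itertools import permutations

# Hypothetical toy taxonomy (child -> parent); not taken from the paper.
PARENT = {
    "asthma": "respiratory disease",
    "bronchitis": "respiratory disease",
    "migraine": "neurological disease",
    "respiratory disease": "disease",
    "neurological disease": "disease",
}


def toy_sim(t1, t2):
    """Assumed symmetric term similarity: 1.0 if equal, 0.5 if siblings, else 0.0."""
    if t1 == t2:
        return 1.0
    p1, p2 = PARENT.get(t1), PARENT.get(t2)
    if p1 is not None and p1 == p2:
        return 0.5
    return 0.0


def max_weight_matching(a_terms, b_terms, sim):
    """Brute-force 1-to-1 maximum-weight bipartite matching.

    Exponential in the input size, so only suitable for tiny annotation sets.
    Assumes `sim` is symmetric.
    """
    # Match every term of the smaller set to a distinct term of the larger set.
    small, large = sorted((list(a_terms), list(b_terms)), key=len)
    best_total, best_pairs = -1.0, []
    for perm in permutations(large, len(small)):
        pairs = list(zip(small, perm))
        total = sum(sim(x, y) for x, y in pairs)
        if total > best_total:
            best_total, best_pairs = total, pairs
    return best_total, best_pairs


def annsim(a_terms, b_terms, sim):
    """Annotation similarity of two entities from their annotation sets.

    The normalization below is an assumption made for this sketch.
    """
    if not a_terms or not b_terms:
        return 0.0
    total, _ = max_weight_matching(a_terms, b_terms, sim)
    return 2.0 * total / (len(a_terms) + len(b_terms))


# Two hypothetical drugs annotated with the conditions they are associated with.
drug_a = ["asthma", "migraine"]
drug_b = ["bronchitis", "migraine", "disease"]
print(annsim(drug_a, drug_b, toy_sim))  # prints 0.6
```

In this example the best matching pairs "asthma" with "bronchitis" (0.5) and "migraine" with "migraine" (1.0), leaving "disease" unmatched. For realistic annotation sets, a polynomial-time assignment solver would replace the brute-force search.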


