Information extraction for knowledge base construction in the music domain

Authors: Sergio Oramas, Luis Espinosa-Anke, Mohamed Sordo, Horacio Saggion, Xavier Serra

DOI: 10.1016/J.DATAK.2016.06.001

Keywords: Information extraction, Semantic Web, Knowledge base, Natural language, Semantics, Cluster analysis, Information retrieval, Relationship extraction, Computer science, Entity linking

Abstract: The rate at which information about music is being created and shared on the web is growing exponentially. However, the challenge of making sense of all this data remains an open problem. In this paper, we present and evaluate an Information Extraction pipeline aimed at the construction of a Music Knowledge Base. Our approach starts off by collecting thousands of stories about songs from the songfacts.com website. Then, we combine a state-of-the-art Entity Linking tool and a linguistically motivated rule-based algorithm to extract semantic relations between entity pairs. Next, relations with similar semantics are grouped into clusters by exploiting syntactic dependencies. These relations are ranked thanks to a novel confidence measure based on statistical and linguistic evidence. Evaluation is carried out intrinsically, by assessing each component of the pipeline, as well as in an extrinsic task, in which we evaluate the contribution of natural language explanations in music recommendation. We demonstrate that our method is able to discover novel facts with high precision, which are missing in current generic as well as music-specific knowledge repositories.

Highlights: A system that constructs a Music Knowledge Base entirely from scratch. A method for clustering and scoring relations in a Relation Extraction pipeline. Reveals facts absent in repositories (e.g. Wikipedia). Explains music recommendations in natural language.
