From web directories to ontologies: natural language processing challenges

Authors: Ilya Zaihrayeu, Lei Sun, Fausto Giunchiglia, Wei Pan, Qi Ju

DOI: 10.1007/978-3-540-76298-0_45

Keywords: Natural language processing, World Wide Web, Heuristic, Natural language, Artificial intelligence, Conditional random field, Software agent, Domain (software engineering), Ontology (information science), Process (engineering), Computer science, Word-sense disambiguation, Semantic Web

Abstract: Hierarchical classifications are used pervasively by humans as a means to organize their data and knowledge about the world. One of their main advantages is that natural language labels, used to describe their contents, are easily understood by human users. However, at the same time, this is also one of their main disadvantages, as these same labels are ambiguous and very hard to be reasoned about by software agents. This fact creates an insuperable hindrance for classifications being embedded in the Semantic Web infrastructure. This paper presents an approach to converting classifications into lightweight ontologies, and it makes the following contributions: (i) it identifies the main NLP problems related to the conversion process and shows how they are different from the classical problems of NLP; (ii) it proposes heuristic solutions to these problems, which are especially effective in this domain; and (iii) it evaluates the proposed solutions by testing them on DMoz data.

References (28)
Vincent J. Della Pietra, Adam L. Berger, Stephen A. Della Pietra, A maximum entropy approach to natural language processing. Computational Linguistics, vol. 22, pp. 39-71 (1996), 10.5555/234285.234289
Alan Rector, Nick Drummond, Matthew Horridge, Jeremy Rogers, Holger Knublauch, Robert Stevens, Hai Wang, Chris Wroe, OWL Pizzas: Practical Experience of Teaching OWL-DL: Common Errors & Common Patterns. Knowledge Acquisition, Modeling and Management, pp. 63-81 (2004), 10.1007/978-3-540-30202-5_5
Che-Yu Yang, J.C. Hung, Word Sense Determination using WordNet and Sense Co-occurrence. Advanced Information Networking and Applications, vol. 1, pp. 779-784 (2006), 10.1109/AINA.2006.353
L.R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, vol. 77, pp. 267-296 (1989), 10.1109/5.18626
John D. Lafferty, Andrew McCallum, Fernando C. N. Pereira, Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. International Conference on Machine Learning, pp. 282-289 (2001)
Marc Tilbrook, Rolf Schwitter, Let's talk in description logic via controlled natural language. Logic and Engineering Natural Language Semantics, pp. 193-207 (2006)
German Rigau, Eneko Agirre, A Proposal for Word Sense Disambiguation using Conceptual Distance. Recent Advances in Natural Language Processing (1995)
Ora Lassila, Tim Berners-Lee, James A. Hendler, The Semantic Web. Scientific American (2001)
Abraham Bernstein, Esther Kaufmann, GINO – a guided input natural language ontology editor. International Semantic Web Conference, pp. 144-157 (2006), 10.1007/11926078_11
Deborah L. McGuinness, Frank van Harmelen, OWL Web Ontology Language Overview. W3C Recommendation (2004)