Exploring Latent Semantic Vector Models Enriched With N-grams

Author: Leif Grönqvist

Abstract: This thesis deals with a kind of vector space model called “Latent Semantic Vector Model”, or LSVM, calculated by the technique “Latent Semantic Indexing”. An LSVM can be used for many things, but I have mainly looked at one direct application: document retrieval. What we gain from an LSVM is the possibility of searching for content rather than for specific keywords. Using an LSVM in a document retrieval system has been shown to improve the quality of the returned document lists, which makes it easier for the user to find the information he or she wants. The problem attacked in this thesis is that the LSVM in the normal case contains just single words, while the terms in searches are in many cases multi-word expressions. LSVMs are trained with various parameter settings for training data, vocabulary, matrix size, context size and, last but not least, different ways to include multi-word expressions directly into the models. The aim is to determine how the performance changes when we go from a word-based LSVM to an LSVM containing both words and multi-word expressions. To be able to measure the performance changes, two evaluation methods are used: synonym tests and document retrieval. Synonym testing and document retrieval are performed for both Swedish and English. The results are improved with added multi-word expressions for the synonym test task, and changed for the worse for document retrieval. For English, the latter change is significant. This work has also resulted in new resources, well suited for the evaluation of vector models: the synonym test set SweHP560, containing 560 synonym test queries from the Swedish “Hogskoleprovet”, and the metrics RankEff and WRS for document retrieval evaluation, which handle an incomplete gold standard in a better way than existing metrics like MAP and bpref.
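The idea of a latent semantic model whose vocabulary holds both single words and multi-word expressions can be illustrated with a minimal sketch. It is not the thesis implementation: the function names (`terms`, `build_lsvm`, `rank`), the toy corpus, the use of raw counts without any term weighting, and the choice to treat every adjacent word pair as a bigram term joined by an underscore are all assumptions made for illustration.

```python
import numpy as np

def terms(text, use_bigrams=True):
    """Split text into unigram terms, optionally adding adjacent-word bigrams."""
    words = text.lower().split()
    out = list(words)
    if use_bigrams:
        # Assumption: a multi-word expression is any adjacent word pair.
        out += [f"{a}_{b}" for a, b in zip(words, words[1:])]
    return out

def build_lsvm(docs, k=2, use_bigrams=True):
    """Build a term-document count matrix and reduce it with a truncated SVD."""
    vocab = sorted({t for d in docs for t in terms(d, use_bigrams)})
    index = {t: i for i, t in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for t in terms(d, use_bigrams):
            counts[index[t], j] += 1.0
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    k = min(k, len(s))
    term_basis = U[:, :k]            # one k-dimensional row per term
    doc_vecs = Vt[:k].T * s[:k]      # documents projected onto the same basis
    return index, term_basis, doc_vecs

def rank(query, index, term_basis, doc_vecs, use_bigrams=True):
    """Return document indices sorted by cosine similarity in the latent space."""
    q = np.zeros(term_basis.shape[0])
    for t in terms(query, use_bigrams):
        if t in index:               # terms outside the vocabulary are ignored
            q[index[t]] += 1.0
    q_vec = q @ term_basis           # project the query like a document
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec)
    scores = (doc_vecs @ q_vec) / (norms + 1e-12)
    return [int(i) for i in np.argsort(-scores)], scores

docs = [
    "latent semantic indexing builds vector models",
    "vector models support document retrieval",
    "swedish synonym test queries",
    "synonym test evaluation for swedish",
]
index, term_basis, doc_vecs = build_lsvm(docs, k=2)
ranking, scores = rank("document retrieval", index, term_basis, doc_vecs)
print(ranking)
```

In this toy corpus the first document contains neither query word, yet it ranks alongside the second one because both load on the same latent dimension, which is the "content rather than keywords" effect the abstract describes. Passing `use_bigrams=False` to both functions gives the word-based variant for comparison.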
