Author: Leif Grönqvist
DOI:
Keywords:
Abstract: This thesis deals with a kind of vector space model called “Latent Semantic Vector Model”, or LSVM, calculated by the technique “Latent Semantic Indexing”. An LSVM can be used for many things, but I have mainly looked at one direct application: document retrieval. What we gain from an LSVM is the possibility of searching for content rather than specific keywords. Using an LSVM in a retrieval system has been shown to improve the quality of the returned lists, which makes it easier for the user to find the information he or she wants. The problem attacked in this thesis is that an LSVM in the normal case contains just single words, while the terms used in searches are in many cases multi-word expressions. LSVMs have been trained with various parameter settings for training data, vocabulary, matrix size, context and, last but not least, different ways to include multi-word expressions directly into the models. The aim is to determine how performance changes when we go from a word-based LSVM to one containing both words and multi-word expressions. To be able to measure the changes, two evaluation methods have been used: synonym tests and document retrieval. Synonym testing has been performed for Swedish [...] English. The results [...] improved [...] added [...] test task, [...] a change for the worse. For English, the latter [...] significant. This work has also resulted in new resources well suited for the evaluation of vector models: the synonym test set SweHP560, containing 560 queries from “Hogskoleprovet”, and the metrics RankEff and WRS for document retrieval evaluation, which handle an incomplete gold standard in a better way than existing metrics like MAP and bpref.