Author: Johannes Leveling
DOI: 10.1007/978-3-642-31178-9_12
Keywords:
Abstract: This paper investigates the effects of stopword removal in different stages of a system for SMS-based FAQ retrieval. Experiments are performed on the FIRE 2011 monolingual English data. The system comprises several stages, including normalization and correction of SMS, retrieval of FAQs potentially containing answers using the BM25 model, and detection of out-of-domain (OOD) queries based on a k-nearest-neighbor classifier. Both retrieval and OOD detection are tested with different stopword lists. Results indicate that i) retrieval performance is highest when stopwords are not removed and decreases when longer stopword lists are employed, ii) OOD detection accuracy is highest when the classifier is trained on features collected during retrieval with no stopwords, and iii) a combination of retrieval with no stopwords and OOD detection with the SMART stopword list yields the best results: 75.1% of in-domain queries are answered correctly and 85.6% of OOD queries are detected correctly.
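
To make the retrieval stage concrete, the following is a minimal sketch of BM25 ranking with an optional stopword list, so the same query can be ranked with and without stopword removal. It is not the paper's implementation: the function names `tokenize` and `bm25_rank`, the whitespace tokenizer, the parameter defaults `k1=1.2` and `b=0.75`, and the IDF variant are all assumptions chosen for illustration, and the toy FAQ strings are not taken from the FIRE 2011 data.

```python
import math
from collections import Counter


def tokenize(text, stopwords=frozenset()):
    """Lowercase, split on whitespace, and drop tokens found in `stopwords`."""
    return [t for t in text.lower().split() if t not in stopwords]


def bm25_rank(query, docs, stopwords=frozenset(), k1=1.2, b=0.75):
    """Rank `docs` (a non-empty list of strings) against `query` with BM25.

    Returns (ranking, scores): document indices sorted by descending score,
    and the per-document scores. k1 and b are common defaults, assumed here.
    """
    doc_tokens = [tokenize(d, stopwords) for d in docs]
    n_docs = len(doc_tokens)
    # Fall back to 1.0 if every document is empty after stopword removal.
    avg_len = sum(len(d) for d in doc_tokens) / n_docs or 1.0
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in doc_tokens for t in set(d))
    query_terms = tokenize(query, stopwords)

    scores = []
    for tokens in doc_tokens:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # One common non-negative IDF variant (assumed, not from the paper).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)

    ranking = sorted(range(n_docs), key=lambda i: scores[i], reverse=True)
    return ranking, scores
```

Calling `bm25_rank(query, faqs)` and `bm25_rank(query, faqs, stopwords=some_list)` on the same inputs gives two rankings that can be compared directly, which is the kind of contrast the abstract reports across stopword lists of different lengths.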