On the Effect of Stopword Removal for SMS-Based FAQ Retrieval

Author: Johannes Leveling

DOI: 10.1007/978-3-642-31178-9_12

Abstract: This paper investigates the effects of stopword removal in different stages of a system for SMS-based FAQ retrieval. Experiments are performed on the FIRE 2011 monolingual English data. The system comprises several stages, including normalization and correction of SMS, retrieval of FAQs potentially containing answers using the BM25 model, and detection of out-of-domain (OOD) queries based on a k-nearest-neighbor classifier. Both retrieval and OOD detection are tested with different stopword lists. Results indicate that i) retrieval performance is highest when stopwords are not removed and decreases when longer stopword lists are employed, ii) OOD detection accuracy is highest when the classifier is trained on features collected during retrieval using no stopwords, and iii) a combination of retrieval using no stopwords and OOD detection trained with SMART stopwords yields the best results: 75.1% of in-domain queries are answered correctly and 85.6% of OOD queries are detected correctly.
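The retrieval stage described in the abstract can be illustrated with a minimal sketch of BM25 scoring in which the stopword list is a parameter, so the same query can be scored with and without stopword removal. This is not the paper's implementation: the toy stopword list `TOY_STOPWORDS`, the sample `FAQS`, the whitespace tokenizer, and the parameter values `k1=1.2` and `b=0.75` are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical toy stopword list; real lists such as SMART are far longer.
TOY_STOPWORDS = frozenset({"a", "an", "the", "is", "of", "to", "for",
                           "how", "do", "i", "my", "what"})

# Hypothetical FAQ questions standing in for the FIRE 2011 data.
FAQS = [
    "how do i reset my password",
    "what is the fee for a new account",
    "how to close my account",
]

def tokenize(text, stopwords=frozenset()):
    """Lowercase, split on whitespace, and drop tokens found in `stopwords`."""
    return [t for t in text.lower().split() if t not in stopwords]

def bm25_scores(query, docs, stopwords=frozenset(), k1=1.2, b=0.75):
    """Return one BM25 score per document, applying the same stopword
    list to both the query and the documents."""
    if not docs:
        return []
    doc_tokens = [tokenize(d, stopwords) for d in docs]
    n_docs = len(doc_tokens)
    avg_len = (sum(len(d) for d in doc_tokens) / n_docs) or 1.0

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in doc_tokens:
        df.update(set(tokens))

    query_terms = set(tokenize(query, stopwords))
    scores = []
    for tokens in doc_tokens:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant that stays positive for frequent terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Score the same query with and without stopword removal.
with_stopwords = bm25_scores("how do i reset password", FAQS)
without_stopwords = bm25_scores("how do i reset password", FAQS, TOY_STOPWORDS)
```

A query made up entirely of listed words, such as "how do i", matches nothing once the list is applied. That shows one way a longer stopword list can discard evidence in very short queries like SMS, though the sketch does not reproduce the paper's experiments.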

References (11)
James Lanagan, Owen Phelan, Neil O'Hare, Barry Smyth, Paul Ferguson, Kevin McCarthy, Alan F. Smeaton. CLARITY at the TREC 2011 Microblog Track (2011).
Caroline Tagg. A corpus linguistics study of SMS text messaging. University of Birmingham (2009).
Mike Gatford, Micheline Hancock-Beaulieu, Susan Jones, Stephen E. Robertson, Steve Walker. Okapi at TREC. Text Retrieval Conference, pp. 109-123 (1994).
Christopher Fox. A stop list for general text. International ACM SIGIR Conference on Research and Development in Information Retrieval, vol. 24, pp. 19-21 (1989). DOI: 10.1145/378881.378888
Govind Kothari, Sumit Negi, Tanveer A. Faruquie, Venkatesan T. Chakaravarthy, L. Venkata Subramaniam. SMS based Interface for FAQ Retrieval. International Joint Conference on Natural Language Processing, pp. 852-860 (2009). DOI: 10.3115/1690219.1690266
Iadh Ounis, Ben He, Rachel Tsz-Wai Lo. Automatically Building a Stopword List for an Information Retrieval System. Journal of Digital Information Management, vol. 3, pp. 3-8 (2005).
Song Han, Fu Lee Wang, Lu Sheng Wang, Feng Zou, Xiaotie Deng. Automatic construction of Chinese stop word list. ACOS'06 Proceedings of the 5th WSEAS International Conference on Applied Computer Science, pp. 1009-1014 (2006).
Ljiljana Dolamic, Jacques Savoy. When stopword lists make the difference. Journal of the Association for Information Science and Technology, vol. 61, pp. 200-203 (2010). DOI: 10.1002/ASI.V61:1
Ibrahim Abu El-Khair. Effects of Stop Words Elimination for Arabic Information Retrieval: A Comparative Study. arXiv: Computation and Language (2006).