A word stemming algorithm for the Spanish language

Authors: A. Honrado, R. Leon, R. O'Donnel, D. Sinclair

DOI: 10.1109/SPIRE.2000.878189

Abstract: The paper describes a word stemming algorithm for the Spanish language. Experiments in document retrieval on English text suggest that stemming based on morphological analysis does not generally or consistently outperform ad-hoc, hand-tuned algorithms such as the one proposed by M. Porter (1980). It is difficult to produce a Porter-style algorithm for a Romance language such as Spanish, however, due to the greater grammatical complexity and the fact that inflection often causes changes to the root of words, not just their endings (as is mostly the case with English). In general terms, the difficulty consists in producing an algorithm which can cope with the additional morphology whilst preserving the simplicity of the algorithm. One such algorithm is presented. It combines dictionary look-ups with some 300 intermediate reduction rules.
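The combination described in the abstract, a dictionary look-up backed by suffix reduction rules, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `IRREGULAR_FORMS`, `SUFFIX_RULES`, `MIN_STEM_LENGTH` and `stem` are hypothetical, the handful of entries are toy data, and none of it is taken from the paper's actual lexicon or its roughly 300 rules.

```python
# Toy sketch of a "dictionary look-up + suffix reduction rules" stemmer.
# All data below is hypothetical and NOT taken from the paper.

# Dictionary look-up: forms whose root changes under inflection,
# so that stripping an ending alone cannot recover the base form.
IRREGULAR_FORMS = {
    "tuvo": "tener",
    "dijeron": "decir",
    "hizo": "hacer",
}

# Reduction rules as (suffix, replacement) pairs.
SUFFIX_RULES = [
    ("aciones", "ar"),
    ("amente", "o"),
    ("ces", "z"),
    ("es", ""),
    ("s", ""),
]

# Assumed guard so that very short words are not reduced to nothing.
MIN_STEM_LENGTH = 3


def stem(word: str) -> str:
    """Return a reduced form of `word` using look-up first, then rules."""
    token = word.lower()

    # Step 1: exact dictionary look-up for irregular forms.
    if token in IRREGULAR_FORMS:
        return IRREGULAR_FORMS[token]

    # Step 2: try the reduction rules, longest suffix first.
    rules = sorted(SUFFIX_RULES, key=lambda rule: len(rule[0]), reverse=True)
    for suffix, replacement in rules:
        if token.endswith(suffix):
            candidate = token[: len(token) - len(suffix)] + replacement
            if len(candidate) >= MIN_STEM_LENGTH:
                return candidate

    # No rule applied: leave the word unchanged.
    return token
```

In this sketch, a form such as "tuvo" is resolved by the look-up because its root differs from "tener", while a regular plural such as "luces" is handled by a rule. Trying longer suffixes first is one simple way to keep a short rule like "s" from pre-empting a more specific one like "ces".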

References (8)
D. K. Harman, The Fourth Text REtrieval Conference. National Institute of Standards and Technology (1996). DOI: 10.6028/NIST.SP.500-236
H. Schütze, G. Grefenstette, D. A. Hull, B. M. Schulze, J. O. Pedersen, E. Gaussier, Xerox TREC-5 site report: Routing, filtering, NLP, and Spanish tracks. Text Retrieval Conference, pp. 167-180 (1996)
Fergus Kelledy, Ruairi O'Donnell, Alan F. Smeaton, TREC-4 experiments at Dublin City University: Thresholding posting lists, query expansion with WordNet and POS tagging of Spanish. Text Retrieval Conference, pp. 373-389 (1995)
Mark W. Davis, New experiments in cross-language text retrieval at NMSU's Computing Research Lab. Text Retrieval Conference, pp. 447-453 (1996)
Fredric C. Gey, Aitao Chen, Jianzhang He, Jason Meggs, Liangjie Xu, Term importance, Boolean conjunct training, negative terms, and foreign language retrieval: probabilistic algorithms at TREC-5. Text Retrieval Conference, pp. 181-190 (1996)
M. F. Porter, An algorithm for suffix stripping. Program: Electronic Library and Information Systems, vol. 40, pp. 313-316 (1997). DOI: 10.1108/EB046814
Donna K. Harman, Overview of the Fourth Text REtrieval Conference (TREC-4). Text Retrieval Conference, pp. 1-23 (1996)
Donna K. Harman, Overview of the Third Text REtrieval Conference (TREC-3). Text Retrieval Conference, pp. 1-19 (1995). DOI: 10.6028/NIST.SP.500-225