Authors: Roi Blanco, Álvaro Barreiro
DOI: 10.1007/978-3-540-71496-5_9
Keywords: Pruning (decision trees), Reduction (complexity), Data mining, Inverted index, Computer science
Abstract: This paper addresses the problem of identifying collection-dependent stop-words in order to reduce the size of inverted files. We present four methods to automatically recognise stop-words, analyse the tradeoff between efficiency and effectiveness, and compare them with a previous pruning approach. The experiments allow us to conclude that in some situations stop-word pruning is competitive with respect to other inverted file reduction techniques.