An Analysis of Case-Base Editing in a Spam Filtering System

Authors: Sarah Jane Delany, Pádraig Cunningham

DOI: 10.1007/978-3-540-28631-8_11

Keywords: Computer science, Case base, Redundancy (engineering), Electronic mail, Machine learning, Concept drift, Artificial intelligence

Abstract: Because of the volume of spam email and its evolving nature, any deployed Machine Learning-based spam filtering system will need to have procedures for case-base maintenance. Key to this will be procedures to edit the case-base to remove noise and eliminate redundancy. In this paper we present a two stage process to do this. We present a new noise reduction algorithm called Blame-Based Noise Reduction that removes cases that are observed to cause misclassification. We also present an algorithm called Conservative Redundancy Reduction that is much less aggressive than the state-of-the-art alternatives and has significantly better generalisation performance in this domain. These new techniques are evaluated against the alternatives in the literature on four datasets of 1000 emails each (50% spam and 50% non spam).
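
The abstract describes the noise reduction step only as removing cases that are observed to cause misclassification. The Python sketch below illustrates that general idea with a simple blame count over leave-one-out nearest-neighbour classification. It is not the published Blame-Based Noise Reduction algorithm: the Jaccard distance, the `min_blame` threshold, the function names, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def jaccard_distance(a, b):
    """Distance between two feature sets (assumed representation)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def k_nearest(cases, query_index, k):
    """Indices of the k cases closest to cases[query_index], excluding itself."""
    query_features = cases[query_index][0]
    others = [i for i in range(len(cases)) if i != query_index]
    # Ties on distance are broken by index so the result is deterministic.
    others.sort(key=lambda i: (jaccard_distance(query_features, cases[i][0]), i))
    return others[:k]


def blame_counts(cases, k=1):
    """Count how often each case votes for the wrong class in a
    misclassification, using leave-one-out k-NN over the case-base."""
    blame = Counter()
    for q in range(len(cases)):
        neighbours = k_nearest(cases, q, k)
        votes = Counter(cases[i][1] for i in neighbours)
        predicted = votes.most_common(1)[0][0]
        true_label = cases[q][1]
        if predicted != true_label:
            for i in neighbours:
                if cases[i][1] != true_label:
                    blame[i] += 1
    return blame


def remove_blamed(cases, k=1, min_blame=2):
    """Drop cases blamed at least min_blame times (hypothetical threshold)."""
    blame = blame_counts(cases, k)
    return [c for i, c in enumerate(cases) if blame[i] < min_blame]


# Toy case-base: (feature set, label). The last case is deliberately mislabelled.
cases = [
    ({"free", "offer", "win"}, "spam"),
    ({"free", "offer", "cash"}, "spam"),
    ({"free", "win", "cash"}, "spam"),
    ({"offer", "win", "cash"}, "spam"),
    ({"meeting", "agenda", "notes"}, "ham"),
    ({"meeting", "agenda", "report"}, "ham"),
    ({"meeting", "notes", "report"}, "ham"),
    ({"agenda", "notes", "report"}, "ham"),
    ({"free", "offer", "win", "cash"}, "ham"),
]

edited = remove_blamed(cases, k=1, min_blame=2)
```

In this toy case-base the mislabelled case is the nearest neighbour of all four spam cases, so it is blamed repeatedly and removed, while a correctly labelled spam case that is blamed only once (for misclassifying the noisy case itself) is kept. The threshold is a crude stand-in for whatever safeguard a real editing procedure would use to avoid deleting useful cases.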
