Authors: Dmitri Alperovitch, Yuchun Tang, Paul Judge, Sven Krasser
DOI:
Keywords:
Abstract: Unsolicited commercial or bulk emails, or emails containing viruses, pose a great threat to the utility of email communications. A recent solution for filtering is reputation systems that can assign a value of trust to each IP address sending email messages. By analyzing the query patterns of each node utilizing reputation information, reputation systems can calculate a reputation score for each queried IP address. In this research, we explore a behavioral classification approach based on features extracted from such global messaging patterns. Due to the large amount of bad senders, this classification task has to cope with highly imbalanced data. Firstly, for each observed sender, we extract periodicity properties using a discrete Fourier transform and breadth information reflecting message volume and recipient distribution. After that, the Granular Support Vector Machine - Boundary Alignment algorithm (GSVM-BA) is implemented to solve the class imbalance problem and is compared to cost-sensitive learning. Lastly, we determine the performance of support vector machine, C4.5 decision tree, naïve Bayesian, and multinomial logistic regression classifiers on the resulting data set. The best performance is achieved by using GSVM-BA to rebalance the data and then an SVM for classification.
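As an illustration of the periodicity features mentioned in the abstract, the following is a minimal sketch of how a discrete Fourier transform could summarize a sender's activity. The abstract does not say which DFT-derived quantities the authors use, so the input (message counts per fixed time interval), the two output features (dominant period and its share of spectral energy), and all function names are assumptions made for this example only.

```python
import cmath
import math


def dft_magnitudes(series):
    """Magnitude of every DFT coefficient of a real-valued sequence (naive O(n^2))."""
    n = len(series)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(series)))
        for k in range(n)
    ]


def periodicity_features(counts):
    """Hypothetical feature extractor, not the authors' exact method.

    counts: messages observed from one sender in consecutive, equal-length
            time intervals (an assumed input format).
    Returns (dominant_period, peak_share):
      dominant_period - length, in intervals, of the strongest cycle, or None
                        if the series has no variation
      peak_share      - fraction of non-DC spectral energy at that frequency
    """
    n = len(counts)
    if n < 2:
        return None, 0.0

    # Remove the mean so the DC component does not dominate the spectrum.
    mean = sum(counts) / n
    centered = [c - mean for c in counts]

    # Keep only positive frequencies k = 1 .. n // 2.
    half = dft_magnitudes(centered)[1 : n // 2 + 1]
    energy = [m * m for m in half]
    total = sum(energy)
    if total == 0:
        return None, 0.0

    peak = max(range(len(half)), key=lambda i: energy[i])
    k = peak + 1  # frequency index of the strongest component
    return n / k, energy[peak] / total
```

For example, a sender whose volume follows a clean four-interval cycle, such as `periodicity_features([4, 2, 0, 2] * 4)`, yields a dominant period of 4 intervals with nearly all spectral energy at that frequency, whereas a constant-rate sender yields no dominant period. A strongly periodic sending pattern of this kind is the sort of behavioral signal a classifier could use.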