Authors: Yoshihiko Suhara, Hiroyuki Toda, Shuichi Nishioka, Seiji Susaki
Keywords:
Abstract: Spammers use a wide range of content generation techniques to produce low-quality pages, known as spam, to achieve their goals. We argue that such spam must be tackled using features. In this paper, we propose novel sentence-level diversity features based on the probabilistic topic model. We combine them with other features to build a classifier. Our experiments show that our method outperforms conventional methods.