A Comparative Study on Feature Selection in Unbalance Text Classification

Author: Yan Xu

DOI: 10.1109/ISISE.2012.19

Keywords: Dimensionality reduction, Effective method, Computer science, Mutual information, Machine learning, Feature selection, Pattern recognition, Feature extraction, Artificial intelligence, Text mining, Cross entropy, Entropy (information theory)

Abstract: … of feature selection algorithms in unbalanced text categorization. We focus on study the effect of feature … effect of each feature selection method on corpus of different unbalance degree. …

References (10)
Jin-Shu Su, Advances in Machine Learning Based Text Categorization. Journal of Software, vol. 17, pp. 1848-, (2006), DOI: 10.1360/JOS171848
Marko Grobelnik, Dunja Mladenic, Feature Selection for Unbalanced Class Distribution and Naive Bayes. International Conference on Machine Learning, pp. 258-267, (1999)
Wenqian Shang, Research on the Algorithm of Feature Selection Based on Gini Index for Text Categorization. Journal of Computer Research and Development, vol. 43, pp. 1688-, (2006), DOI: 10.1360/CRAD20061002
Lei Yu, Chris Ding, Steven Loscalzo, Stable feature selection via dense feature groups. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '08), pp. 803-811, (2008), DOI: 10.1145/1401890.1401986
Nitesh V. Chawla, Nathalie Japkowicz, Aleksander Kotcz, Editorial. ACM SIGKDD Explorations Newsletter, vol. 6, pp. 1-6, (2004), DOI: 10.1145/1007730.1007733
Anirban Dasgupta, Petros Drineas, Boulos Harb, Vanja Josifovski, Michael W. Mahoney, Feature selection methods for text classification. Knowledge Discovery and Data Mining, pp. 230-239, (2007), DOI: 10.1145/1281192.1281220
Willie Ng, Manoranjan Dash, An Evaluation of Progressive Sampling for Imbalanced Data Sets. International Conference on Data Mining, pp. 657-661, (2006), DOI: 10.1109/ICDMW.2006.28
Yiming Yang, Jan O. Pedersen, A Comparative Study on Feature Selection in Text Categorization. International Conference on Machine Learning, pp. 412-420, (1997)