Clustering based feature selection using Extreme Learning Machines for text classification

Authors: Rajendra Kumar Roul, Shashank Gugnani, Shah Mit Kalpeshbhai

DOI: 10.1109/INDICON.2015.7443788

Keywords: Feature (machine learning), Linear classifier, Feature selection, Random subspace method, Extreme learning machine, Machine learning, Pattern recognition, Feature learning, Artificial intelligence, Computer science, Feature vector, Cluster analysis

Abstract: The expansion of the dynamic Web increases the number of digital documents, which has attracted many researchers to work in the field of text classification. It is an important and well-studied area of machine learning with a variety of modern applications. A good feature selection is of paramount importance to increase the efficiency of classifiers working on text data. Choosing the most relevant features out of what can be an incredibly large set of data is particularly important for accurate text classification. This paper is a motivation in that direction, where we propose a new clustering-based feature selection technique that reduces the feature size. Traditional k-means clustering along with TF-IDF and Wordnet helps us form a quality, reduced feature vector to train the Extreme Learning Machine (ELM) and Multi-layer ELM (ML-ELM), which have been used as the classifiers. Experimental work has been carried out on the 20-Newsgroups and DMOZ datasets. Results on these two standard datasets demonstrate the efficiency of our approach using ML-ELM over state-of-the-art classifiers.
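
For readers unfamiliar with the classifier named in the abstract, the following is a minimal sketch of a basic single-hidden-layer ELM: hidden-layer weights are drawn at random and left untrained, and only the output weights are solved in closed form with a pseudoinverse. It is not the authors' implementation and does not cover the clustering-based feature selection or ML-ELM. The class name `ELM`, the parameter `n_hidden`, the sigmoid activation, and the toy two-class data are all assumptions made for illustration.

```python
import numpy as np


class ELM:
    """Single-hidden-layer ELM classifier (illustrative sketch, hypothetical names)."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        n_features = X.shape[1]
        self.classes_ = np.unique(y)
        # Hidden weights and biases are drawn at random and never trained.
        self.W = self.rng.normal(size=(n_features, self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # One-hot targets, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Output weights: least-squares solution via the Moore-Penrose pseudoinverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ T
        return self

    def predict(self, X):
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]


# Toy data (an assumption, not from the paper): two well-separated 2-D blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-3, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

model = ELM(n_hidden=20).fit(X, y)
accuracy = (model.predict(X) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

In a text-classification setting, `X` would instead hold document feature vectors (for example TF-IDF weights over the selected terms), which is where a reduced feature set lowers the cost of the pseudoinverse step.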

References (16)
Xipeng Qiu, Jinlong Zhou, Xuanjing Huang. An Effective Feature Selection Method for Text Categorization. Advances in Knowledge Discovery and Data Mining, pp. 50-61 (2011). DOI: 10.1007/978-3-642-20841-6_5
Zhi-Hong Deng, Shi-Wei Tang, Dong-Qing Yang, Ming Zhang, Li-Yu Li, Kun-Qing Xie. A Comparative Study on Feature Weight in Text Categorization. Asia-Pacific Web Conference, pp. 588-597 (2004). DOI: 10.1007/978-3-540-24655-8_64
Charu C. Aggarwal, ChengXiang Zhai. A survey of text classification algorithms. Mining Text Data, pp. 163-222 (2012). DOI: 10.1007/978-1-4614-3223-4_6
Anton Akusok, Rui Nian, Victor C.M. Leung, Amaury Lendasse, Sergio Decherchi, Andrew Beng Jin Teoh, Paolo Gastaldo, Liyanaarachchi Lekamalage Chamara Kasun, Liang Feng, Jaihie Kim, Guang-Bin Huang, Junfa Liu, Jiarun Lin, Chi Man Vong, Yew-Soon Ong, Francesco Corona, Kar-Ann Toh, Yiqiang Chen, Jianping Yin, Rodolfo Zunino, Hanchao Yu, Jehyoung Jeon, Beom-Seok Oh, Xuefeng Yang, Kezhi Mao, Meng-Hiot Lim, Hongming Zhou, Zhiping Cai, Yoan Miche, Qiang Liu, Erik Cambria, Kuan Li. Extreme Learning Machine (2013).
George Forman. An extensive empirical study of feature selection metrics for text classification. Journal of Machine Learning Research, vol. 3, pp. 1289-1305 (2003).
Guang-Bin Huang, Qin-Yu Zhu, Chee-Kheong Siew. Extreme learning machine: Theory and applications. Neurocomputing, vol. 70, pp. 489-501 (2006). DOI: 10.1016/J.NEUCOM.2005.12.126
Fabrizio Sebastiani. Machine learning in automated text categorization. ACM Computing Surveys, vol. 34, pp. 1-47 (2002). DOI: 10.1145/505282.505283
Anirban Dasgupta, Petros Drineas, Boulos Harb, Vanja Josifovski, Michael W. Mahoney. Feature selection methods for text classification. Knowledge Discovery and Data Mining, pp. 230-239 (2007). DOI: 10.1145/1281192.1281220
Isabelle Guyon, André Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research, vol. 3, pp. 1157-1182 (2003). DOI: 10.1162/153244303322753616
Corinna Cortes, Vladimir Vapnik. Support-Vector Networks. Machine Learning, vol. 20, pp. 273-297 (1995). DOI: 10.1023/A:1022627411411