PCA-Guided k-Means with Variable Weighting and Its Application to Document Clustering

Authors: Katsuhiro Honda, Akira Notsu, Hidetomo Ichihashi

DOI: 10.1007/978-3-642-04820-3_26

Abstract: PCA-guided k-Means is a deterministic approach to k-Means clustering, in which cluster indicators are derived in a PCA-guided manner. This paper proposes a new approach to k-Means with variable selection by introducing a variable weighting mechanism into PCA-guided k-Means. The relative responsibility of variables is estimated in a similar way to FCM clustering, while the membership indicator is derived in a PCA-guided manner, in which the principal component scores are calculated by considering the responsibility weights of variables. Thus, the variables that have meaningful information for capturing cluster structures are emphasized in the calculation of membership indicators. Numerical experiments, including an application to document clustering, demonstrate the characteristics of the proposed method.
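The alternation described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, and it rests on several assumptions: it handles only two clusters, it takes the sign of the first principal component score as the cluster indicator (in the spirit of the Ding and He result on k-means and PCA), and it borrows the weight update from the automated variable weighting formula of Huang et al. The function name `pca_guided_weighted_2means` and the exponent `beta` are hypothetical choices made for this illustration.

```python
import numpy as np

def pca_guided_weighted_2means(X, beta=2.0, n_iter=20):
    """Illustrative two-cluster sketch; NOT the method of the paper.

    Assumptions: every column of X is non-constant, beta > 1, and the
    weight update follows the W-k-means style rule (Huang et al.).
    """
    X = np.asarray(X, dtype=float)
    # Standardize so that weights compare variables on a common scale.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    n, m = Z.shape
    w = np.full(m, 1.0 / m)  # start from uniform variable weights
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # Scale each variable by w**(beta/2) so squared distances
        # are weighted by w**beta.
        Zw = Z * np.sqrt(w ** beta)
        # First principal component score of the weighted data.
        U, S, _ = np.linalg.svd(Zw, full_matrices=False)
        score = U[:, 0] * S[0]
        # Deterministic indicator: sign of the first PC score.
        labels = (score > 0).astype(int)
        # Per-variable within-cluster dispersion on unweighted Z.
        D = np.zeros(m)
        for k in (0, 1):
            members = Z[labels == k]
            if len(members):
                D += ((members - members.mean(axis=0)) ** 2).sum(axis=0)
        D = np.maximum(D, 1e-12)  # guard against division by zero
        # w_j = 1 / sum_l (D_j / D_l)^(1 / (beta - 1)); sums to 1.
        ratio = (D[:, None] / D[None, :]) ** (1.0 / (beta - 1.0))
        new_w = 1.0 / ratio.sum(axis=1)
        converged = np.allclose(new_w, w, atol=1e-8)
        w = new_w
        if converged:
            break
    return labels, w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))
    X[:100, :2] -= 3.0  # variables 0 and 1 carry the cluster structure
    X[100:, :2] += 3.0  # variables 2 and 3 are pure noise
    labels, w = pca_guided_weighted_2means(X)
    print("variable weights:", np.round(w, 3))
```

In this sketch, variables with small within-cluster dispersion receive larger weights, so they contribute more to the principal component score in the next pass. Because the sign of a singular vector is arbitrary, the two cluster labels may be swapped between runs on different data.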

References (12)
Katsuhiro Honda, Hidetomo Ichihashi, Francesco Masulli, Stefano Rovetta. Linear Fuzzy Clustering With Selection of Variables Using Graded Possibilistic Approach. IEEE Transactions on Fuzzy Systems, vol. 15, pp. 878-889 (2007). DOI: 10.1109/TFUZZ.2006.889946
Yutaka Tanaka, Yuichi Mori. Principal component analysis based on a subset of variables: variable selection and sensitivity analysis. American Journal of Mathematical and Management Sciences, vol. 17, pp. 61-89 (1997). DOI: 10.1080/01966324.1997.10737430
Chris Ding, Xiaofeng He. Linearized cluster assignment via spectral ordering. Twenty-first International Conference on Machine Learning (ICML '04), p. 30 (2004). DOI: 10.1145/1015330.1015407
Chris Ding, Xiaofeng He. K-means clustering via principal component analysis. Twenty-first International Conference on Machine Learning (ICML '04), p. 29 (2004). DOI: 10.1145/1015330.1015408
Joshua Zhexue Huang, Michael K. Ng, Hongqiang Rong, Zichen Li. Automated variable weighting in k-means type clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, pp. 657-668 (2005). DOI: 10.1109/TPAMI.2005.95
J. B. MacQueen. Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics, pp. 281-297 (1967)
K. Honda, H. Ichihashi. Regularized linear fuzzy clustering and probabilistic PCA mixture models. IEEE Transactions on Fuzzy Systems, vol. 13, pp. 508-516 (2005). DOI: 10.1109/TFUZZ.2004.840104
K. Honda, H. Ichihashi. Linear fuzzy clustering techniques with missing values and their application to local principal component analysis. IEEE Transactions on Fuzzy Systems, vol. 12, pp. 183-193 (2004). DOI: 10.1109/TFUZZ.2004.825073
Xiaofeng He, Hongyuan Zha, Horst D. Simon, Chris Ding, Ming Gu. Spectral Relaxation for K-means Clustering. Neural Information Processing Systems, vol. 14, pp. 1057-1064 (2001)