Weighted proportional k-interval discretization for naive-Bayes classifiers

Authors: Geoffrey I. Webb, Ying Yang

DOI: 10.5555/1760894.1760960

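Abstract: The use of different discretization techniques can be expected to affect the classification bias and variance of naive-Bayes classifiers. We call such an effect discretization bias and variance. Proportional k-interval discretization (PKID) tunes discretization bias and variance by adjusting discretized interval size and number proportional to the number of training instances. Theoretical analysis suggests that this is desirable for naive-Bayes classifiers. However, PKID is sub-optimal when learning from training data of small size. We argue that this is because PKID equally weighs bias reduction and variance reduction. But for small data, variance reduction can contribute more to lower learning error and thus should be given greater weight than bias reduction. Accordingly, we propose weighted proportional k-interval discretization (WPKID), which establishes a more suitable bias and variance trade-off for small data while allowing additional training data to be used to reduce both bias and variance. Our experiments demonstrate that for naive-Bayes classifiers, WPKID improves upon PKID for smaller datasets with significant frequency; and WPKID delivers lower classification error significantly more often than not in comparison to the three other leading alternative discretization techniques studied.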
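The abstract does not give the formulas, so the following is only a minimal sketch of how the two schemes might choose interval size and interval count from the number of training instances n. It assumes the commonly described formulation in which PKID sets size s and count t so that s × t = n with s = t, and WPKID instead requires s × t = n with s − m = t for a minimum interval size m (m = 30 is an assumed default). The function names `pkid_intervals` and `wpkid_intervals` are hypothetical.

```python
import math


def pkid_intervals(n):
    """Assumed PKID rule: size s and count t with s * t = n and s = t,
    so both are roughly sqrt(n). Returns (average size, count)."""
    t = max(1, math.isqrt(n))  # integer interval count
    return n / t, t


def wpkid_intervals(n, m=30):
    """Assumed WPKID rule: s * t = n and s - m = t.
    Substituting s = t + m gives t^2 + m*t - n = 0; take the positive root.
    m is a hypothetical minimum interval size (30 assumed here)."""
    t = (-m + math.sqrt(m * m + 4 * n)) / 2
    t = max(1, int(t))  # round down, keep at least one interval
    return n / t, t


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, pkid_intervals(n), wpkid_intervals(n))
```

Under these assumptions, a small dataset of 100 instances gets 10 intervals of 10 instances each from the PKID rule but only 3 larger intervals from the WPKID rule, which reflects the greater weight given to variance reduction on small data. As n grows, both interval size and interval count increase under either rule.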

References (25)
Hung-Ju Huang, Tzu-Tsung Wong. Why Discretization Works for Naive Bayesian Classifiers. International Conference on Machine Learning, pp. 399-406 (2000).
Bojan Cestnik. Estimating probabilities: a crucial task in machine learning. European Conference on Artificial Intelligence, pp. 147-149 (1990).
Michael J. Pazzani. An iterative improvement approach for the discretization of numeric attributes in Bayesian classifiers. Knowledge Discovery and Data Mining, pp. 228-233 (1995).
João Gama, Luis Torgo, Carlos Soares. Dynamic Discretization of Continuous Attributes. Ibero-American Conference on AI, pp. 160-169 (1998). DOI: 10.1007/3-540-49795-1_14
Llanos Mora López, Inmaculada Fortes Ruiz, Rafael Morales Bueno, Francisco Triguero Ruiz. Dynamic Discretization of Continuous Values from Time Series. European Conference on Machine Learning, pp. 280-291 (2000). DOI: 10.1007/3-540-45164-1_30
Eun Bae Kong, Thomas G. Dietterich. Error-Correcting Output Coding Corrects Bias and Variance. Machine Learning Proceedings 1995, pp. 313-321 (1995). DOI: 10.1016/B978-1-55860-377-6.50046-3
Steven L. Salzberg, Alberto Segre. Programs for Machine Learning (1994).
Ron Kohavi, David Wolpert. Bias plus variance decomposition for zero-one loss functions. International Conference on Machine Learning, pp. 275-283 (1996).
Keki B. Irani, Usama M. Fayyad. Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. International Joint Conference on Artificial Intelligence, vol. 2, pp. 1022-1027 (1993).