DIOPT: Extremely Fast Classification Using Lookups and Optimal Feature Discretization

Authors: Johan Garcia, Topi Korhonen

DOI: 10.1109/IJCNN48605.2020.9207037

Abstract: For low-dimensional classification problems we propose the novel DIOPT approach, which considers the construction of a discretized feature space. Predictions for all cells in this space are obtained by means of a reference classifier, and the class labels are stored in a lookup table generated by enumerating the complete space. This then leads to extremely high classification throughput, as inference consists only of discretizing the relevant features and reading the class label from the lookup table index corresponding to the concatenation of the discretized feature bin indices. Since the size of the lookup table is limited due to memory constraints, the selection of optimal features and their respective discretization levels is paramount. We propose a particular supervised discretization approach striving to achieve maximal class separation of the discretized features, and further employ a purpose-built memetic algorithm to search towards the optimal selection of features and discretization levels. The run time and classification accuracy are compared to benchmark random forest and decision tree classifiers on several publicly available data sets. Orders of magnitude improvements are recorded in classification runtime, with insignificant or modest degradation in classification accuracy for many of the evaluated binary classification tasks.
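The lookup idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the bin edges in `EDGES`, the stand-in `reference_classifier`, the choice of one representative point per cell, and the mixed-radix `flat_index` used to combine bin indices into a single table index (the paper speaks of concatenating bin indices, which may be implemented differently).

```python
from bisect import bisect_right
from itertools import product
from math import prod

# Hypothetical bin edges for two features (assumed values, not from the paper).
EDGES = [
    [0.5, 1.5, 2.5],  # feature 0: 3 edges -> 4 bins
    [10.0, 20.0],     # feature 1: 2 edges -> 3 bins
]
BINS = [len(e) + 1 for e in EDGES]


def reference_classifier(x):
    """Stand-in for a trained reference classifier (illustrative rule only)."""
    return 1 if x[0] + x[1] / 10.0 > 2.5 else 0


def representative(feature, b):
    """Pick one point inside bin b (assumption: midpoint, or edge +/- 1 for outer bins)."""
    e = EDGES[feature]
    if b == 0:
        return e[0] - 1.0
    if b == len(e):
        return e[-1] + 1.0
    return (e[b - 1] + e[b]) / 2.0


def flat_index(bin_ids):
    """Combine per-feature bin indices into one table index (mixed-radix encoding)."""
    idx = 0
    for b, n in zip(bin_ids, BINS):
        idx = idx * n + b
    return idx


def build_table():
    """Enumerate every cell of the discretized space and store the predicted label."""
    table = [0] * prod(BINS)
    for bin_ids in product(*(range(n) for n in BINS)):
        point = [representative(f, b) for f, b in enumerate(bin_ids)]
        table[flat_index(bin_ids)] = reference_classifier(point)
    return table


def predict(table, x):
    """Inference: discretize each feature, then read the label from the table."""
    bin_ids = [bisect_right(e, v) for e, v in zip(EDGES, x)]
    return table[flat_index(bin_ids)]


TABLE = build_table()
```

With these assumed edges the table has 4 × 3 = 12 entries, and a call such as `predict(TABLE, [3.0, 25.0])` costs two binary searches and one list read. The table size grows as the product of the per-feature bin counts, which is why the abstract stresses choosing features and discretization levels carefully under a memory budget.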
