A comparative analysis of data preparation algorithms for customer churn prediction

Authors: Kristof Coussement, Stefan Lessmann, Geert Verstraeten

DOI: 10.1016/J.DSS.2016.11.007

Abstract: Data preparation is a process that aims to convert independent (categorical and continuous) variables into a form appropriate for further analysis. We examine data-preparation alternatives to enhance the prediction performance of the commonly used logit model. This study, conducted in a churn prediction modeling context, benchmarks an optimized logit model against eight state-of-the-art data mining techniques that use standard input data, including real-world cross-sectional data from a large European telecommunication provider. The results lead to the following conclusions. (i) Analysts should acknowledge that the data-preparation technique they choose actually affects churn prediction performance; we find improvements of up to 14.5% in the area under the receiver operating characteristic curve (AUC) and 34% in the top decile lift. (ii) The enhanced logistic regression is also competitive with more advanced single and ensemble data mining algorithms. This article concludes with some managerial implications and suggestions for further research, including evidence of the generalizability of the results to other business settings.

Highlights: We study the impact of data preparation on customer churn prediction performance. Effective data preparation improves AUC by up to 14.5% and top decile lift by up to 34%.
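The two evaluation metrics named in the abstract, AUC and top decile lift, can be illustrated on toy data. The following is a minimal sketch and not the authors' code: the function names `auc` and `top_decile_lift`, the toy labels, and the toy scores are all hypothetical, and the definitions used are the common textbook ones (AUC as the share of churner/non-churner pairs ranked correctly; top decile lift as the churn rate in the 10% highest-scored customers divided by the overall churn rate).

```python
def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one churner and one non-churner")
    # Count churner/non-churner pairs in which the churner gets the higher score.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_decile_lift(y_true, scores):
    """Churn rate among the top 10% of scores divided by the overall churn rate."""
    n = len(y_true)
    k = max(1, n // 10)  # size of the top decile (at least one customer)
    ranked = sorted(zip(scores, y_true), key=lambda t: t[0], reverse=True)
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(y_true) / n
    if base_rate == 0:
        raise ValueError("no churners in the data")
    return top_rate / base_rate


# Hypothetical toy data: 20 customers, 4 of whom churn (label 1).
y_true = [1, 1, 0, 1, 0, 0, 0, 1] + [0] * 12
# Hypothetical churn scores, strictly decreasing from 1.0 to 0.05.
scores = [(20 - i) / 20 for i in range(20)]

print("AUC:", auc(y_true, scores))
print("Top decile lift:", top_decile_lift(y_true, scores))
```

A lift above 1 means the model concentrates churners in its highest-scored decile more than random targeting would, which is why the metric is often reported alongside AUC in retention campaigns.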
