New variational Bayesian approaches for statistical data mining: with applications to profiling and differentiating habitual consumption behaviour of customers in the wireless telecommunication industry

Author: Burton Wu

Keywords: Clustering high-dimensional data; Telecommunications; Purchasing; Engineering; Markov chain Monte Carlo; Data mining; Cluster analysis; Bayesian probability; Timestamp; Profiling (information science); Mixture model

Abstract: This thesis investigates profiling and differentiating customers through the use of statistical data mining techniques. The business application of our work centres on examining individuals' seldom studied yet critical consumption behaviour over an extensive time period within the context of the wireless telecommunication industry; consumption behaviour (as opposed to purchasing behaviour) is behaviour that has been performed so frequently that it has become habitual and involves minimal intention or decision making. Key variables investigated are the activity initialised timestamp and cell tower location, as well as the activity type and usage quantity (e.g., voice call with duration in seconds); the research focuses on customers' spatial and temporal usage behaviour. The main methodological emphasis is on the development of clustering models based on Gaussian mixture models (GMMs), which are fitted with the recently developed variational Bayesian (VB) method. VB is an efficient deterministic alternative to the popular but computationally demanding Markov chain Monte Carlo (MCMC) methods. The standard VB-GMM algorithm is extended by allowing component splitting, such that it is robust to initial parameter choices and can automatically and efficiently determine the number of components. The new algorithm we propose allows more effective modelling of individuals' highly heterogeneous and spiky spatial usage behaviour, or more generally human mobility patterns; the term spiky describes data patterns with large areas of low probability mixed with small areas of high probability. Customers are then characterised and segmented based on the fitted GMM, which corresponds to how each of them uses the products/services spatially in their daily lives; this essentially reflects their likely lifestyle and occupational traits. Other significant research contributions include fitting GMMs using VB to circular data, i.e., the temporal usage behaviour, and developing clustering algorithms suitable for high-dimensional data based on VB-GMM.
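
The abstract mentions fitting GMMs to circular data such as the time of day at which an activity starts. The following is a minimal sketch of why such data needs circular treatment; it is not the thesis's algorithm. The function names (`hour_to_angle`, `circular_mean_hour`) and the sample call times are hypothetical, chosen only for illustration. Hours are mapped to angles on the unit circle, so that 23:30 and 00:30 are treated as neighbours rather than as values almost a full day apart.

```python
import math

def hour_to_angle(hour):
    """Map an hour of day in [0, 24) to an angle in radians in [0, 2*pi)."""
    return 2.0 * math.pi * (hour % 24.0) / 24.0

def circular_mean_hour(hours):
    """Circular mean of hours of day, returned as an hour in [0, 24)."""
    # Average the unit vectors, then convert the resulting direction back to hours.
    s = sum(math.sin(hour_to_angle(h)) for h in hours)
    c = sum(math.cos(hour_to_angle(h)) for h in hours)
    angle = math.atan2(s, c)  # direction of the summed vector, in (-pi, pi]
    return (angle * 24.0 / (2.0 * math.pi)) % 24.0

# Hypothetical call start times clustered around midnight (illustrative data only).
calls = [23.0, 23.5, 0.5, 1.0]

linear = sum(calls) / len(calls)   # arithmetic mean lands at midday, far from every call
circ = circular_mean_hour(calls)   # circular mean lands close to midnight

print("arithmetic mean hour:", linear)
print("circular mean hour:", circ)
```

The arithmetic mean places the "typical" call at noon, where no call occurred, while the circular mean falls next to midnight. A mixture model for temporal usage needs the same wrap-around awareness, whichever circular representation is actually used.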
