Author: Burton Wu
DOI:
Keywords: Clustering high-dimensional data, Telecommunications, Purchasing, Engineering, Markov chain Monte Carlo, Data mining, Cluster analysis, Bayesian probability, Timestamp, Profiling (information science), Mixture model
Abstract: This thesis investigates profiling and differentiating customers through the use of statistical data mining techniques. The business application of our work centres on examining individuals' seldom studied yet critical consumption behaviour over an extensive time period within the context of the wireless telecommunication industry; consumption behaviour (as opposed to purchasing behaviour) is behaviour that has been performed so frequently that it becomes habitual and involves minimal intention or decision making. Key variables investigated are the activity initialised timestamp and cell tower location, as well as the activity type and usage quantity (e.g., voice call with duration in seconds); the research focuses on customers' spatial and temporal usage behaviour. The main methodological emphasis is on the development of clustering models based on Gaussian mixture models (GMMs), which are fitted with the recently developed variational Bayesian (VB) method. VB is an efficient deterministic alternative to the popular but computationally demanding Markov chain Monte Carlo (MCMC) methods. The standard VB-GMM algorithm is extended by allowing component splitting, such that it is robust to initial parameter choices and can automatically and efficiently determine the number of components. The new algorithm we propose allows more effective modelling of highly heterogeneous and spiky spatial (or temporal) behaviour of individuals, such as human mobility patterns; the term spiky describes data patterns with large areas of low probability mixed with small areas of high probability. Customers are then characterised and segmented based on the fitted GMM, which corresponds to how each of them uses the products/services spatially in their daily lives; this essentially reflects their likely lifestyle and occupational traits. Other significant contributions include fitting GMMs using VB to circular data, i.e., the temporal usage behaviour, and developing clustering algorithms suitable for high dimensional data based on the use of VB-GMM.
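To make the term "spiky" concrete, the following is a minimal sketch of a one-dimensional Gaussian mixture with two narrow, high-weight components and one broad, low-weight background component. It only evaluates the mixture density and the component responsibilities; it does not implement the VB fitting or the component-splitting algorithm described in the abstract. All weights, means, standard deviations, and the "home"/"work" labels are hypothetical values chosen for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate Gaussian N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, components):
    """Density of a Gaussian mixture; components = [(weight, mean, sd), ...]."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in components)


def responsibilities(x, components):
    """Posterior probability that x was generated by each component."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in components]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical "spiky" pattern: positions (in km) along a single road.
components = [
    (0.45, 10.0, 0.2),   # narrow spike, e.g. a "home" location (assumed)
    (0.45, 60.0, 0.2),   # narrow spike, e.g. a "work" location (assumed)
    (0.10, 50.0, 25.0),  # broad background with low density everywhere
]

spike_density = mixture_pdf(10.0, components)       # on top of a spike
background_density = mixture_pdf(35.0, components)  # between the spikes
resp_at_home = responsibilities(10.0, components)

print(f"density at spike:      {spike_density:.4f}")
print(f"density in background: {background_density:.4f}")
print("responsibilities at x=10:", [round(r, 4) for r in resp_at_home])
```

In this sketch the density on a spike is several hundred times the density between the spikes, which is the "small areas of high probability within large areas of low probability" pattern the abstract refers to.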