Authors: Alfredo Vellido, Paulo J. G. Lisboa, Karon Meehan
DOI:
Keywords:
Abstract: The process of extracting knowledge from data involves the discovery of patterns of interest which may be implicit, for instance, in specific clusters of data points. In the context of Internet retailing, finding clusters of typical consumer types is among the most important uses of data mining techniques. Cluster-based market segmentation models, grounded on surveys of customer opinion, can give the online retailer a competitive edge, forming the basis for effective targeting and enabling the redirection of made-to-measure content towards the customer. The Generative Topographic Mapping (GTM) is proposed as a statistically principled technique for cluster-based market segmentation. In this non-linear latent variable model, a posterior probability of cluster membership can be defined for each individual, providing a robust framework for the visualization of high-dimensional data and for segmentation at different levels of granularity. The advantages of the GTM over the well-known Self-Organizing Map (SOM), to which it is an alternative, are described, and this new model is applied in a business-to-consumer e-commerce case study. In addition, an entropy-based measure is defined to quantify the information content of the GTM unsupervised maps about an externally imposed class label.
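
The posterior probability of cluster membership mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard GTM setting of equal-weight isotropic Gaussian components with a shared inverse variance `beta`, and it takes the component centres (the images of the latent grid nodes in data space) as already fitted. The function names, the array layout, and in particular `class_entropy_per_node` are hypothetical: the record does not give the authors' definition of their entropy-based measure, so the second function is only one plausible way to relate a map to an external class label.

```python
import numpy as np


def gtm_responsibilities(X, centers, beta):
    """Posterior probability of each latent node for each data point.

    X:       (N, D) data points.
    centers: (K, D) component centres in data space (assumed already fitted).
    beta:    shared inverse variance of the isotropic Gaussian components.
    Returns a (K, N) array whose columns sum to 1.
    """
    # Squared Euclidean distance between every centre and every point.
    d2 = ((centers[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    log_p = -0.5 * beta * d2
    # Subtract the column maximum before exponentiating for numerical stability.
    log_p -= log_p.max(axis=0, keepdims=True)
    R = np.exp(log_p)
    return R / R.sum(axis=0, keepdims=True)


def class_entropy_per_node(R, labels, n_classes):
    """Entropy in bits of the class distribution at each latent node.

    Assumption: class mass at a node is the responsibility-weighted count of
    each label. This is an illustrative choice, not the authors' measure.
    """
    onehot = np.eye(n_classes)[labels]           # (N, C)
    mass = R @ onehot                            # (K, C)
    total = np.maximum(mass.sum(axis=1, keepdims=True), 1e-300)
    p = mass / total
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p > 0, p * np.log2(p), 0.0)
    return -terms.sum(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 2))                  # toy data, not survey data
    centers = rng.normal(size=(4, 2))            # stand-in for fitted centres
    R = gtm_responsibilities(X, centers, beta=2.0)
    labels = np.array([0, 1, 0, 1, 0, 1])        # hypothetical external labels
    print(class_entropy_per_node(R, labels, n_classes=2))
```

Under this formulation, a node with low entropy is dominated by one class and a node near the maximum is uninformative about the label. Hard segment assignments follow from taking the most probable node for each individual with `R.argmax(axis=0)`.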