Authors: Alexander Ypma, Tom Heskes
DOI: 10.1007/978-3-540-39663-5_3
Keywords:
Abstract: We propose mixtures of hidden Markov models (HMMs) for modelling the clickstreams of web surfers. Hence, the page categorization is learned from the data without the need for a (possibly cumbersome) manual categorization. We provide an EM algorithm for training a mixture of HMMs and show that additional static user data can be incorporated easily to possibly enhance the labelling of users. Furthermore, we use prior knowledge to enhance generalization and to avoid numerical problems. We use parameter tying to decrease the danger of overfitting and to reduce computational overhead. We put a flat prior on the parameters to deal with the problem that certain transitions between page categories occur very seldom or not at all, in order to ensure that a nonzero transition probability between these categories nonetheless remains. In applications to artificial data and real-world web logs we demonstrate the usefulness of our approach. We train a mixture of HMMs on artificial navigation patterns and show that the correct model is being learned. Moreover, we show that the use of static 'satellite data' may enhance the labeling of shorter navigation patterns. When applying a mixture of HMMs to real-world web logs from a large Dutch commercial web site, we demonstrate that sensible page categorizations are learned.
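The two mechanisms named in the abstract, assigning a clickstream to a mixture component and keeping rarely observed transitions at nonzero probability, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the `pseudo_count` parameter, and the use of additive smoothing to realise the prior are all assumptions made for illustration; the abstract does not specify the exact form of the prior or of the update equations.

```python
import math


def smoothed_transitions(counts, pseudo_count=1.0):
    """Normalise expected transition counts row by row after adding a
    constant pseudo-count (an assumed stand-in for the prior on the
    transition parameters), so that no transition probability is zero."""
    probs = []
    for row in counts:
        total = sum(row) + pseudo_count * len(row)
        probs.append([(c + pseudo_count) / total for c in row])
    return probs


def log_likelihood(seq, start, trans, emit):
    """Scaled forward algorithm: log p(seq | one HMM).

    seq   : list of observed page indices
    start : start[k]    = p(first hidden category is k)
    trans : trans[j][k] = p(category k follows category j)
    emit  : emit[k][o]  = p(page o | category k)
    """
    n = len(start)
    alpha = [start[k] * emit[k][seq[0]] for k in range(n)]
    log_lik = 0.0
    for t in range(len(seq)):
        if t > 0:
            alpha = [
                sum(alpha[j] * trans[j][k] for j in range(n)) * emit[k][seq[t]]
                for k in range(n)
            ]
        scale = sum(alpha)          # rescale to avoid numerical underflow
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def responsibilities(seq, weights, hmms):
    """Posterior probability of each mixture component for one clickstream
    (the per-sequence quantity an E-step for a mixture would compute)."""
    logs = [
        math.log(w) + log_likelihood(seq, *hmm)
        for w, hmm in zip(weights, hmms)
    ]
    top = max(logs)                 # subtract the maximum before exponentiating
    unnorm = [math.exp(v - top) for v in logs]
    z = sum(unnorm)
    return [u / z for u in unnorm]
```

In this sketch each HMM is a `(start, trans, emit)` tuple, `responsibilities` gives a soft assignment of a clickstream to user clusters, and `smoothed_transitions` would be applied to the expected transition counts of each component in the M-step. Parameter tying, for example sharing `emit` across components, is not shown.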