Authors: Richard Nock, Panu Luosto, Jyrki Kivinen
DOI: 10.1007/978-3-540-87481-2_11
Keywords: Exponential family, Mathematics, Simple (abstract algebra), Symmetrization, Distortion, Mathematical optimization, Algorithm, Cluster analysis, Initialization, Euclidean geometry, Bregman divergence
Abstract: Two recent breakthroughs have dramatically improved the scope and performance of k-means clustering: squared Euclidean seeding for the initialization step, and Bregman clustering for the iterative step. In this paper, we first unite the two frameworks by generalizing the former improvement to seeding, a biased randomized technique using divergences, while generalizing its important theoretical approximation guarantees as well. We end up with a complete hard clustering algorithm that integrates the distortion at hand in both steps. Our second contribution is to further generalize this algorithm to handle mixed distortions, which smooth out the asymmetry of Bregman divergences. In contrast to some other symmetrization approaches, ours keeps the algorithm simple and lets it retain the benefits of regular clustering. Preliminary experiments show that seeding with a suitable divergence can help discover the underlying structure of the data.
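The biased randomized seeding described in the abstract can be illustrated by generalizing k-means++ style D² seeding: each new center is drawn with probability proportional to a point's divergence to its nearest already-chosen center. The sketch below is an assumption-laden illustration, not the paper's exact algorithm; the function names (`bregman_divergence`, `bregman_seeding`) and the choice of generator φ(x) = ||x||² (which recovers squared Euclidean seeding) are ours.

```python
import numpy as np

def bregman_divergence(x, y, phi, grad_phi):
    """Generic Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

def bregman_seeding(data, k, phi, grad_phi, rng):
    """Biased randomized seeding (k-means++ style), generalized to a Bregman
    divergence: a new center is sampled with probability proportional to each
    point's divergence to its nearest already-chosen center."""
    n = data.shape[0]
    centers = [data[rng.integers(n)]]  # first center: uniform at random
    for _ in range(k - 1):
        # Divergence of every point to its closest current center.
        d = np.array([min(bregman_divergence(x, c, phi, grad_phi) for c in centers)
                      for x in data])
        centers.append(data[rng.choice(n, p=d / d.sum())])
    return np.array(centers)

# Generator phi(x) = ||x||^2 gives D_phi(x, y) = ||x - y||^2,
# so this instance recovers classical squared Euclidean seeding.
phi = lambda x: np.dot(x, x)
grad_phi = lambda x: 2.0 * x

rng = np.random.default_rng(0)
# Two well-separated synthetic blobs of 20 points each.
data = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
centers = bregman_seeding(data, 2, phi, grad_phi, rng)
print(centers.shape)  # (2, 2)
```

Swapping in another convex generator (for example φ(x) = Σ xᵢ log xᵢ, yielding the KL divergence) changes the seeding bias without touching the algorithm's structure, which is the point of the generalization.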