Authors: Stéphan Clémençon, Romaric Gaudel, Jérémie Jakubowicz
DOI: 10.1007/978-3-642-23780-5_32
Keywords: Algorithm, Symmetric group, Set (abstract data type), Cardinality, Discrete mathematics, Cluster analysis, Fourier transform, Rank (computer programming), Mathematics, Feature selection, Unsupervised learning
Abstract: The purpose of this paper is to introduce a novel approach to clustering rank data on a set of possibly large cardinality n ∈ N*, relying upon the Fourier representation of functions defined on the symmetric group S_n. In the present setup, which covers a wide variety of practical situations, rank data are viewed as distributions on S_n. Cluster analysis aims at segmenting data into homogeneous subgroups that are, hopefully, very dissimilar in a certain sense. Considering dissimilarity measures/distances between distributions on the non-commutative group S_n in a coordinate manner, by viewing it as embedded in the set [0, 1]^{n!} for instance, hardly yields interpretable results and raises obvious computational issues. In contrast, evaluating the closeness of groups of permutations in the Fourier domain may be much easier. Indeed, a few well-chosen Fourier (matrix) coefficients may permit one to approximate efficiently two distributions on S_n, as well as their degree of dissimilarity, while describing global properties in an interpretable fashion. Following in the footsteps of recent advances in automatic feature selection in the context of unsupervised learning, we propose to cast the task of clustering rankings in terms of the optimization of a criterion that can be expressed in the Fourier domain in a simple manner. The effectiveness of the proposed method is illustrated by numerical experiments based on artificial and real data.
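To make the idea of comparing groups of rankings through a few low-order coefficients concrete, here is a minimal Python sketch. It is not the paper's algorithm or criterion. It assumes each ranking is stored as a tuple where `sigma[i]` is the rank given to item `i`, and it uses the first-order marginal matrix (the probability that item `i` receives rank `j`), which is the Fourier coefficient of the distribution at the permutation representation of S_n. The function names and the toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import permutations


def empirical_distribution(rankings, n):
    """Return a dict mapping every permutation of range(n) to its observed frequency."""
    counts = Counter(tuple(r) for r in rankings)
    total = len(rankings)
    return {sigma: counts.get(sigma, 0) / total for sigma in permutations(range(n))}


def first_order_marginals(dist, n):
    """Return the n x n matrix M with M[i][j] = P(item i is given rank j)."""
    M = [[0.0] * n for _ in range(n)]
    for sigma, p in dist.items():
        for item, rank in enumerate(sigma):
            M[item][rank] += p
    return M


def frobenius_distance(A, B):
    """Euclidean (Frobenius) distance between two matrices of equal shape."""
    return sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb)) ** 0.5


# Toy data (assumed for illustration): two groups of rankings of n = 3 items.
n = 3
group_a = [(0, 1, 2), (0, 1, 2), (0, 2, 1), (1, 0, 2)]
group_b = [(2, 1, 0), (2, 1, 0), (1, 2, 0), (2, 0, 1)]

M_a = first_order_marginals(empirical_distribution(group_a, n), n)
M_b = first_order_marginals(empirical_distribution(group_b, n), n)

# Compare the groups using n * n numbers instead of n! probabilities.
gap = frobenius_distance(M_a, M_b)
```

The marginal matrix is doubly stochastic, so it summarizes a distribution on S_n with n² entries rather than n! values. This is one simple instance of the kind of compact, interpretable summary the abstract refers to; higher-order coefficients would capture finer structure.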