Authors: Vasil Khalidov, Florence Forbes, Radu Horaud
DOI: 10.1162/NECO_A_00074
Keywords: Modality (human–computer interaction), Model selection, Pattern recognition, Convergence (routing), Mathematics, Artificial intelligence, Cluster analysis, Initialization, Global optimization, Expectation–maximization algorithm, Mixture model
Abstract: The problem of multimodal clustering arises whenever the data are gathered with several physically different sensors. Observations from different modalities are not necessarily aligned, in the sense that there is no obvious way to associate or compare them in some common space. A solution may consist in considering multiple clustering tasks independently for each modality. The main difficulty with such an approach is to guarantee that the unimodal clusterings are mutually consistent. In this letter, we show that multimodal clustering can be addressed within a novel framework: conjugate mixture models. These models exploit the explicit transformations that are often available between an unobserved parameter space (objects) and each of the observation spaces (sensors). We formulate the problem as a likelihood maximization task and derive the associated expectation-maximization algorithm. The convergence properties of the proposed algorithm are thoroughly investigated. Several local and global optimization techniques are proposed in order to increase its convergence speed. Two initialization strategies are proposed and compared. A consistent model selection criterion is proposed. The algorithm and its variants are tested and evaluated on the 3D localization of speakers using both auditory and visual data.