Optimal string clustering based on a Laplace-like mixture and EM algorithm on a set of strings

Authors: Hitoshi Koyano, Morihiro Hayashida, Tatsuya Akutsu

DOI: 10.1016/J.JCSS.2019.07.003

Abstract: In this study, we address the problem of clustering string data in an unsupervised manner by developing a theory of a mixture model and an EM algorithm for strings based on the probability theory on a topological monoid of strings developed in our previous studies. We begin with introducing a parametric probability distribution on a set of strings, which has location and dispersion parameters of a string and a positive real number. We develop an iteration algorithm for estimating the parameters of the mixture model of the distributions introduced and demonstrate that this algorithm converges to the EM algorithm, which cannot be explicitly written for this mixture model, with probability one and strongly consistently estimates its parameters as the numbers of observed strings and iterations increase. We finally derive a procedure for clustering strings that is asymptotically optimal in the sense that the posterior probability of making correct classifications is maximized.
