Document processing employing probabilistic topic modeling of documents represented as text words transformed to a continuous space

Authors: Stéphane Clinchant, Florent Perronnin

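Abstract: A set of word embedding transforms is applied to transform the text words of a set of documents into K-dimensional word vectors, in order to generate sets or sequences of word vectors representing the documents. A probabilistic topic model is learned using these sets or sequences of word vectors. The same word embedding transforms are applied to the text words of an input document to generate a set or sequence of word vectors representing that document. The learned probabilistic topic model is then applied to assign probabilities for its topics to the input document. A document processing operation such as annotation, classification, or similar document retrieval may be performed using the assigned topic probabilities.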
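The pipeline in the abstract (embed words, learn a topic model over the word vectors, assign topic probabilities to a new document) can be sketched as follows. This is only an illustration under stated assumptions: the abstract names neither the embedding nor the form of the topic model, so the sketch uses a hand-made 2-dimensional embedding table and an isotropic Gaussian mixture fitted by EM, with each mixture component standing in for a topic. All function names and data are hypothetical.

```python
import math

def embed_words(words, embedding):
    """Map each known text word to its K-dimensional vector (unknown words are skipped)."""
    return [embedding[w] for w in words if w in embedding]

def sq_dist(x, y):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def posteriors(x, model):
    """Posterior probability of each topic (mixture component) for one word vector."""
    weights, means, variances = model
    k = len(x)
    logs = [
        math.log(w) - 0.5 * (k * math.log(2 * math.pi * v) + sq_dist(x, m) / v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def fit_topic_model(vectors, n_topics, n_iter=30):
    """Fit an isotropic Gaussian mixture with EM (assumed stand-in for the topic model)."""
    k = len(vectors[0])
    # Deterministic farthest-point initialisation of the component means.
    means = [list(vectors[0])]
    while len(means) < n_topics:
        far = max(vectors, key=lambda x: min(sq_dist(x, m) for m in means))
        means.append(list(far))
    weights = [1.0 / n_topics] * n_topics
    variances = [1.0] * n_topics
    for _ in range(n_iter):
        # E-step: topic responsibilities for every word vector.
        resp = [posteriors(x, (weights, means, variances)) for x in vectors]
        # M-step: re-estimate weight, mean and variance of each topic.
        for t in range(n_topics):
            mass = sum(r[t] for r in resp) + 1e-9
            weights[t] = mass / len(vectors)
            means[t] = [sum(r[t] * x[d] for r, x in zip(resp, vectors)) / mass
                        for d in range(k)]
            spread = sum(r[t] * sq_dist(x, means[t]) for r, x in zip(resp, vectors))
            variances[t] = max(spread / (mass * k), 1e-3)  # floor avoids collapse
    return weights, means, variances

def document_topic_probs(doc_vectors, model):
    """Average the per-word topic posteriors to get document-level topic probabilities."""
    if not doc_vectors:
        raise ValueError("document has no embeddable words")
    resp = [posteriors(x, model) for x in doc_vectors]
    return [sum(r[t] for r in resp) / len(resp) for t in range(len(model[0]))]

# Toy embedding (K = 2) and training corpus; both are invented for illustration.
embedding = {
    "goal": [1.0, 0.1], "match": [0.9, 0.2], "team": [1.1, 0.0],
    "bank": [0.0, 1.0], "loan": [0.1, 0.9], "rate": [0.2, 1.1],
}
corpus = [["goal", "match", "team"], ["bank", "loan", "rate"], ["team", "goal", "bank"]]

train_vectors = [v for doc in corpus for v in embed_words(doc, embedding)]
model = fit_topic_model(train_vectors, n_topics=2)

sports = document_topic_probs(embed_words(["goal", "team"], embedding), model)
finance = document_topic_probs(embed_words(["loan", "rate"], embedding), model)
print(sports, finance)
```

The resulting probability vectors could then feed a downstream operation, for example comparing two documents by the distance between their topic probabilities for retrieval, or passing them to a classifier as features.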

References (36)
Magnus Sahlgren. An Introduction to Random Indexing. Terminology and Knowledge Engineering (2005).
Stéphane Clinchant, Cyril Goutte, Eric Gaussier. Lexical Entailment for Information Retrieval. Lecture Notes in Computer Science, pp. 217-228 (2006). DOI: 10.1007/11735106_20
Florent C. Perronnin, Yan Liu. Modeling Images as Mixtures of Image Models (2008).
Florent Perronnin, Christopher Dance, Gabriela Csurka, Marco Bressan. Adapted Vocabularies for Generic Visual Categorization. Computer Vision – ECCV 2006, pp. 464-475 (2006). DOI: 10.1007/11744085_36
Florent Perronnin, Jorge Sánchez, Thomas Mensink. Improving the Fisher Kernel for Large-Scale Image Classification. European Conference on Computer Vision, vol. 6314, pp. 143-156 (2010). DOI: 10.1007/978-3-642-15561-1_11
G. Csurka. Visual Categorization with Bags of Keypoints. European Conference on Computer Vision, vol. 1, pp. 22- (2004).
Ruhi Sarikaya, Brian Edward Doorenbos Kingsbury, Yuqing Gao, Yonggang Deng. Machine Translation in Continuous Space. Conference of the International Speech Communication Association, pp. 2350-2353 (2008).
David M Blei, Andrew Y Ng, Michael I Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research, vol. 3, pp. 993-1022 (2003). DOI: 10.5555/944919.944937