Permutations as a means to encode order in word space

Authors: Magnus Sahlgren, Pentti Kanerva, Anders Holst

Keywords: Representation (mathematics), Sequence, Coordinate vector, Word (computer architecture), Convolution (computer science), Natural language processing, Random indexing, Artificial intelligence, Word order, Mathematics, Sentence

Abstract: We show that sequence information can be encoded into high-dimensional fixed-width vectors using permutations of coordinates. Computational models of language often represent words with high-dimensional semantic vectors compiled from word-use statistics. A word's semantic vector usually encodes the contexts in which the word appears in a large body of text but ignores word order. However, word order often signals a word's grammatical role in a sentence and thus tells of the word's meaning. Jones and Mewhort (2007) show that word order can be included in the semantic vectors using holographic reduced representation and convolution. We show here that the order information can be captured also by permuting vector coordinates, thus providing a general and computationally light alternative to convolution.
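The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes sparse ternary random index vectors, uses cyclic rotation as the coordinate permutation, and looks only at the immediately preceding and following word. The names `index_vector`, `rotate`, `train`, and `nearest`, and the values of `DIM` and `NONZERO`, are hypothetical choices made for illustration.

```python
import math
import random

DIM = 1000     # vector width (assumed value, for illustration only)
NONZERO = 10   # number of +1/-1 entries per index vector (assumed value)


def index_vector(rng):
    """Return a sparse random ternary vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0] * DIM
    for i, pos in enumerate(rng.sample(range(DIM), NONZERO)):
        vec[pos] = 1 if i % 2 == 0 else -1
    return vec


def rotate(vec, shift):
    """Permute coordinates by cyclic rotation; a negative shift rotates the other way."""
    shift %= len(vec)
    return vec[-shift:] + vec[:-shift] if shift else list(vec)


def train(sentences, seed=0):
    """Accumulate order-sensitive context vectors from tokenized sentences.

    A neighbor's index vector is rotated by its relative position
    (-1 for the preceding word, +1 for the following word) before it is
    added, so "X before w" and "X after w" leave different traces in w's vector.
    """
    rng = random.Random(seed)
    index, context = {}, {}
    for sentence in sentences:
        for word in sentence:
            if word not in index:
                index[word] = index_vector(rng)
                context[word] = [0] * DIM
    for sentence in sentences:
        for pos, word in enumerate(sentence):
            for offset in (-1, 1):
                npos = pos + offset
                if 0 <= npos < len(sentence):
                    shifted = rotate(index[sentence[npos]], offset)
                    ctx = context[word]
                    for i, value in enumerate(shifted):
                        ctx[i] += value
    return index, context


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(query, index):
    """Return the word whose index vector is most similar to the query vector."""
    return max(index, key=lambda word: cosine(query, index[word]))
```

Because a rotation is undone by rotating back, the context vector can be queried by position. After `index, context = train([["dog", "bites", "man"]])`, the call `nearest(rotate(context["bites"], 1), index)` is expected to recover the preceding word `dog`, and `nearest(rotate(context["bites"], -1), index)` the following word `man`. Each step is a reindexing plus an addition, which is the sense in which permutation is computationally lighter than convolution.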

References (10)
Jan Kristoferson, Pentti Kanerva, Anders Holst. Random indexing of text samples for latent semantic analysis. Conference of the Cognitive Science Society, vol. 22 (2000).
Michael N. Jones, Douglas J. K. Mewhort. Representing word meaning and order information in a composite holographic lexicon. Psychological Review, vol. 114, pp. 1-37 (2007). doi:10.1037/0033-295X.114.1.1
Sebastian Padó, Mirella Lapata. Dependency-Based Construction of Semantic Space Models. Computational Linguistics, vol. 33, pp. 161-199 (2007). doi:10.1162/COLI.2007.33.2.161
Magnus Sahlgren, Jussi Karlgren. From Words to Understanding. CSLI Publications, pp. 294-308 (2001).
Martin Redington, Nick Chater, Steven Finch. Distributional Information: A Powerful Cue for Acquiring Syntactic Categories. Cognitive Science, vol. 22, pp. 425-469 (1998). doi:10.1207/S15516709COG2204_2
Christos H. Papadimitriou, Hisao Tamaki, Prabhakar Raghavan, Santosh Vempala. Latent semantic indexing: a probabilistic analysis. Symposium on Principles of Database Systems, pp. 159-168 (1998). doi:10.1145/275487.275505
S. Kaski. Dimensionality reduction by random mapping: fast similarity computation for clustering. International Joint Conference on Neural Networks, vol. 1, pp. 413-418 (1998). doi:10.1109/IJCNN.1998.682302
Dominic Widdows. Unsupervised methods for developing taxonomies by combining syntactic and statistical information. Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology (NAACL '03), pp. 197-204 (2003). doi:10.3115/1073445.1073481
T.A. Plate. Holographic reduced representations. IEEE Transactions on Neural Networks, vol. 6, pp. 623-641 (1995). doi:10.1109/72.377968