Identifying hierarchical structure in sequences: a linear-time algorithm

作者: C. G. Nevill-Manning , I. H. Witten

DOI: 10.1613/JAIR.374

关键词:

摘要: SEQUITUR is an algorithm that infers a hierarchical structure from sequence of discrete symbols by replacing repeated phrases with grammatical rule generates the phrase, and continuing this process recursively. The result representation original sequence, which offers insights into its lexical structure. driven two constraints reduce size grammar, produce as by-product. breaks new ground operating incrementally. Moreover, method's simple permits proof it operates in space time linear input. Our implementation can 50,000 per second has been applied to extensive range real world sequences.

参考文章(23)
Geoffrey N. Leech, Johansson. Stig, Helen Goodluck, Manual of information to accompany the Lancaster-Oslo : Bergen Corpus of British English, for use with digital computers Department of English, University of Oslo. ,(1978)
John H. Andreae, Thinking with the teachable machine ,(1977)
David Kurlander, Allen Cypher, Daniel Conrad Halbert, None, Watch what I do: programming by demonstration MIT Press. ,(1993)
J. Gerard Wolfp, Language Acquisition and the Discovery of Phrase Structure. Language and Speech. ,vol. 23, pp. 255- 269 ,(1980) , 10.1177/002383098002300303
Andreas Stolcke, Stephen Omohundro, Inducing Probabilistic Grammars by Bayesian Model Merging international colloquium on grammatical inference. pp. 106- 118 ,(1994) , 10.1007/3-540-58473-0_141
E Mark Gold, Language identification in the limit Information & Computation. ,vol. 10, pp. 447- 474 ,(1967) , 10.1016/S0019-9958(67)91165-5
Robert E. Odeh, Donald E. Knuth, The art of computer programming, volume 1 (3rd ed.): fundamental algorithms Journal of the American Statistical Association. ,vol. 64, pp. 401- ,(1997) , 10.2307/2283757
Kurt Vanlehn, William Ball, A Version Space Approach to Learning Context-free Grammars Machine Learning. ,vol. 2, pp. 39- 74 ,(1987) , 10.1023/A:1022812926936
Craig G. Nevill-Manning, Ian H. Witten, Gordon W. Paynter, Browsing in digital libraries: a phrase-based approach acm international conference on digital libraries. pp. 230- 236 ,(1997) , 10.1145/263690.263826
Craig M. Cook, Azriel Rosenfeld, Alan R. Aronson, Grammatical inference by Hill Climbing Information Sciences. ,vol. 10, pp. 59- 80 ,(1976) , 10.1016/S0020-0255(76)90602-2