作者: C. G. Nevill-Manning , I. H. Witten
DOI: 10.1613/JAIR.374
关键词:
摘要: SEQUITUR is an algorithm that infers a hierarchical structure from sequence of discrete symbols by replacing repeated phrases with grammatical rule generates the phrase, and continuing this process recursively. The result representation original sequence, which offers insights into its lexical structure. driven two constraints reduce size grammar, produce as by-product. breaks new ground operating incrementally. Moreover, method's simple permits proof it operates in space time linear input. Our implementation can 50,000 per second has been applied to extensive range real world sequences.