作者: Craig G Nevill-Manning , Ian H Witten
DOI: 10.1093/COMJNL/40.2_AND_3.103
关键词:
摘要: This paper describes an algorithm, called SEQUITUR, that identi"es hierarchical structure in sequences of discrete symbols and uses information for compression. On many practical it performs well at both compression structural inference, producing comprehensible descriptions sequence the form grammar rules. The algorithm can be stated concisely two constraints on a context-free grammar. Inference is performed incrementally, faithfully representing input all times. It implemented ef"ciently operates time approximately linear length. Despite its simplicity ef"ciency, SEQUITUR succeeds inferring range interesting structures from naturally occurring sequences.