DOI: 10.1007/BF01759045
关键词: Statistical model 、 Mathematics 、 Markov chain 、 Trie 、 Suffix 、 Suffix tree 、 Discrete mathematics 、 Combinatorics 、 Prefix 、 Word (computer architecture) 、 String (computer science)
摘要: This paper studies in a probabilistic framework some topics concerning the way words (strings) can overlap, and relationship of this to height digital trees associated with set words. A word is defined as random sequence (possibly infinite) symbols over finite alphabet. key notion analignment matrix {Cij}i,j=1n introduced whereCij length longest string that prefix theith thejth word. It proved an tree simply related alignment through order statistics. In particular, using observation proving inequalities for statistics, we establish adigital trie under anindependent model (i.e., all are statistically independent) asymptotically equal 2 logźn wheren number stored ź parameter model. result generalized three directions, namely considerb-tries,Markovian dependency among letters word), adependent words). when consecutive Markov dependent (Markovian model), then demonstrate converges probability where underlying chain. On other hand, suffix which fall into model, show does not exceed logźn, These results find plenty applications analysis data structures built