作者: Antonio Restivo , Giovanna Rosone
DOI: 10.1016/J.TCS.2010.11.040
关键词:
摘要: Compression algorithms based on Burrows-Wheeler transform (BWT) take advantage of the fact that word output BWT shows a local similarity and then turns out to be highly compressible. The aim present paper is study such ''clustering effect'' by using notions methods from Combinatorics Words. notion balance plays central role in our investigation. Empirical observations suggest actually combinatorial property input ensure optimal compression. Moreover, it reasonable assume more balanced is, we have after (and therefore better compression is). This hypothesis here corroborated experiments ''real'' text, entropy as measure degree word. In setting Words, sound confirmation previous given result Mantaci et al. (2003) [27], which states that, case binary alphabet, there an equivalence between circularly words, words having clusterized BWT, conjugates standard words. alphabets size greater than two, no equivalence. last section devoted investigate relationships these notions, other related ones (as, for instance, palindromic richness) general alphabet.