Authors: Nieves R. Brisaboa, Rodrigo Cánovas, Francisco Claude, Miguel A. Martínez-Prieto, Gonzalo Navarro
DOI: 10.1007/978-3-642-20662-7_12
Keywords:
Abstract: The problem of storing a set of strings (a string dictionary) in compact form appears naturally in many cases. While classically it has represented a small part of the whole data to be processed (e.g., for Natural Language processing or for indexing text collections), recent applications in Web engines, RDF graphs, Bioinformatics, and others handle very large dictionaries, whose size is a significant fraction of the whole data. Thus efficient approaches to compress them are necessary. In this paper we empirically compare the time and space performance of some existing alternatives, as well as new ones we propose. We show that space reductions of up to 20% of the original size are possible while supporting searches within a few microseconds, and up to 10% within tens or hundreds of microseconds.