Hi,
I found this paper recently published in proceedings of CIKM'10:
VSEncoding: efficient coding and fast decoding of integer lists via
dynamic programming, F. Silvestri, R. Venturini
Available here http://portal.acm.org/citation.cfm?id=1871437.1871592
My favorite quotes:
"We point out that only our methods, together with In
terpolative, are able to beat the entropy of the lists on the
datasets. This quasiparadoxical eect is, indeed, present
because entropy does not consider context information. En
tropy, or to use a notation commonly used in text compres
sion, zerothorder entropy, does not take into account pat
terns (i.e. the context) that can be present in lists of blocks
of integers. By grouping together blocks of integers, in fact,
we are able to assign codewords to more than a single value
at a time. Therefore, it appears obvious that we can beat
the entropy in the case of VSE, VSER and Interpolative. Es
sentially, this is possible since we exploit regularities on the
lists on these very skewed dgaps lists (e.g., small values close
to each other or quite long runs of 1s). We remark that beat
the entropy is not possible with any prex code (e.g., sta
tistical compressors like Arithmetic and Human or integer
encoders like , , 's, Golomb, and so on). Therefore, our
methods is certainly better in compression than any of these
kind of methods without the need of any comparison. Any
way, for the sake of completeness, we report those results as
well in Table 3."
"As expected, Interpolative is the slowest as opposed to VSE
which tops others with more than 800 millions of integers
per second. Our methods, VSE and VSER, are among the
fastest in decoding with a number of mis (millions of integers per
second) decoded ranging from 450 of VSER to 835 of
VSE both of them measured using the gov2 collection. It
is interesting to observe the better performance in terms
of decoding speed of VSE with respect to others, and in
particular with respect to OPTP4D, Simple9 and Simple16
which are considered stateoftheart as far as decompression
speed is concerned."
P4D compression achieved a speed of 460 mis in this test, which means
the VSE method is twice as fast.

Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__  ____/__\/ Information Retrieval, Semantic Web
_____ \   Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com

To unsubscribe, email: devunsubscribe@lucene.apache.org
For additional commands, email: devhelp@lucene.apache.org
