lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Yonik Seeley <>
Subject Re: TrieRange
Date Sat, 07 Feb 2009 00:52:55 GMT
On Fri, Feb 6, 2009 at 6:18 PM, Uwe Schindler <> wrote:
>> Encoding a slice per character makes the code simpler, but increases
>> the size of the index... but perhaps not enough to worry about in
>> practice?
> This is correct. For 2bit and 4bit there is a lot of overhead by this, but
> there is no way round (any ideas how to fix this?). But 8bit is the most
> compact one. There needs to be more testing and benchmarking.

Separate bit slicing and String encoding.... they are independent.
If a,b,c,d are prefix codes designating precision, and w,x,y,z are
each 2 bits of the number, then


Everything after the prefix can be encoded in a single character in each case.

Lucene's prefix encoding of the index will remove some of the
redundancy... buy only for numbers that are very packed together.


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message