lucene-java-user mailing list archives

From Doug Cutting
Subject Re: Regarding range queries.
Date Tue, 09 Aug 2005 16:48:28 GMT

If your improvements are of general utility, please contribute them. 
Even if they are not, post them as-is and perhaps someone will take the 
time to make them more reusable.



Tony Schwartz wrote:
> I think there are a few things that should be added to lucene to really give a huge
> benefit to applications of lucene that are centered around dates.  If documents are
> added in date order (generally but not exactly), you can use this fact to improve memory
> usage of lucene in several ways.
> 1.  a sparse bitset can be used instead of a full array for Date RangeFilters.
> 2.  sorting can be improved by storing the StringIndex (sort array) to disk when the index is
> updated, then loading only the portions required for a particular search.  If most people
> search more recent docs, you can keep those portions of the sort array in memory and load
> the "older" portions only when needed.
> 3.  use the same sparse (and reversible) bitset instead of the Lucene BitVector for
> storing the deleted docs for a particular segment. (very old docs are probably deleted,
> again based on date).
> 4.  sorting can also be greatly improved by NOT storing the field values in memory if
> the index is not used in a "multi-index" environment.
> I have implemented these techniques for my particular implementation of an application
> logs search tool and have seen incredible results.  I have many users searching 50
> million application logs (1k each) with 512 MB memory for my app where users are sorting
> and filtering on every search.
> Again, these features will only be useful for indexes that have a rough correlation between
> date and docid (which I believe happens to be very common).
> Tony Schwartz
> "What we need is more cowbell."
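
To illustrate item 1 in the quoted list: when docids roughly follow date order, the
docs matching a date range tend to form a few long contiguous runs, so the filter
can store run boundaries instead of one bit per document. The following is only a
minimal sketch of that idea, not Tony's implementation and not a Lucene class. The
name RunBitSet and its methods are hypothetical, and it assumes docids are added in
strictly ascending order.

```java
import java.util.Arrays;

/** Hypothetical run-based sparse bit set; illustration only, not part of Lucene. */
class RunBitSet {
    private int[] starts = new int[4]; // inclusive start of each run
    private int[] ends = new int[4];   // exclusive end of each run
    private int runs = 0;

    /** Sets the bit for docId. Assumes docids arrive in strictly ascending order. */
    void add(int docId) {
        if (runs > 0 && docId < ends[runs - 1]) {
            throw new IllegalArgumentException("docids must be strictly ascending");
        }
        if (runs > 0 && docId == ends[runs - 1]) {
            ends[runs - 1]++; // docId is adjacent to the last run, so extend it
            return;
        }
        if (runs == starts.length) { // grow both arrays together
            starts = Arrays.copyOf(starts, runs * 2);
            ends = Arrays.copyOf(ends, runs * 2);
        }
        starts[runs] = docId;
        ends[runs] = docId + 1;
        runs++;
    }

    /** Returns true if the bit for docId is set. */
    boolean get(int docId) {
        int idx = Arrays.binarySearch(starts, 0, runs, docId);
        if (idx >= 0) {
            return true; // docId is exactly the start of a run
        }
        int prev = -idx - 2; // last run that starts before docId, or -1 if none
        return prev >= 0 && docId < ends[prev];
    }

    /** Number of runs stored; memory use is proportional to this, not to maxDoc. */
    int runCount() {
        return runs;
    }
}
```

With this layout, a filter that matches docids 1000 through 1999 plus docid 5000
costs two runs (four ints) regardless of how many documents the index holds. The
trade-off is that lookups become a binary search rather than a single array read.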

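Item 2 in the quoted list (keeping the sort array on disk and loading only the
portions a search needs) could be sketched along the following lines. This is an
assumption-laden illustration, not the approach Tony describes in detail: it
assumes the sort array has been written as big-endian 4-byte ints, one per docid,
and the class name PagedIntArray, its page size, and its LRU policy are all
hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical on-disk int array with an LRU page cache; not part of Lucene. */
class PagedIntArray implements AutoCloseable {
    private final RandomAccessFile file;
    private final int pageSize; // number of ints per page
    private final Map<Integer, int[]> cache;

    PagedIntArray(String path, int pageSize, final int maxPages) throws IOException {
        this.file = new RandomAccessFile(path, "r");
        this.pageSize = pageSize;
        // Access-ordered map: the least recently used page is evicted first.
        this.cache = new LinkedHashMap<Integer, int[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, int[]> eldest) {
                return size() > maxPages;
            }
        };
    }

    /** Returns the sort value stored for docId, loading its page from disk if needed. */
    int get(int docId) throws IOException {
        int page = docId / pageSize;
        int[] values = cache.get(page);
        if (values == null) {
            values = loadPage(page);
            cache.put(page, values);
        }
        return values[docId % pageSize];
    }

    private int[] loadPage(int page) throws IOException {
        long offset = (long) page * pageSize * 4L;
        long remaining = (file.length() - offset) / 4L;
        if (remaining <= 0) {
            throw new IndexOutOfBoundsException("page " + page + " is past end of file");
        }
        int count = (int) Math.min(pageSize, remaining); // last page may be short
        byte[] raw = new byte[count * 4];
        file.seek(offset);
        file.readFully(raw);
        int[] values = new int[count];
        ByteBuffer.wrap(raw).asIntBuffer().get(values); // ByteBuffer is big-endian by default
        return values;
    }

    /** Number of pages currently held in memory. */
    int cachedPages() {
        return cache.size();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

If most searches touch recent documents, the pages covering the highest docids
would stay in the cache, while pages for older documents are read only when a
search reaches them and are evicted again afterwards.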