lucene-java-user mailing list archives

From Chris Hostetter
Subject RE: Date ranges - getting the approach right
Date Sun, 16 Jul 2006 19:46:24 GMT
: The second approach requires three hits, doesn't it?
: (1) TermQuery on start date + sort on document ID
: (2) TermQuery on end date + reverse sort on document ID
: (3) The actual query with a filter on the above

You wouldn't need 3 queries for each date range you wanted to precompute;
you could use one pass over the TermEnum to seek forward to the first term
at or after the start of each range, and then seek a TermDocs to that term
to find the first doc in that range.  The highest ID in a range is simply
one less than the lowest ID of whatever comes after it.  Then you can have
a Filter that just flat out knows the lowest and highest IDs for each range.

Basically you are precomputing all of your ranges in one low level
TermEnum scan instead of a separate query/filter pass (which internally
does a TermEnum scan) for each range.
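
Here is a rough sketch of that single scan in plain Java, to make the
bookkeeping concrete. It does not use the Lucene API: a TreeMap from date
term to the first doc ID carrying that term stands in for the
TermEnum/TermDocs pair, and DocIdRange, RangePrecomputer, and the
yyyyMMdd term format are all hypothetical names and choices made up for
illustration. It assumes docs were added in date order and nothing is
ever deleted, so doc IDs increase with the date.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a Filter that only remembers two ints. */
final class DocIdRange {
    final int lowest;   // inclusive
    final int highest;  // inclusive; highest < lowest means the range is empty

    DocIdRange(int lowest, int highest) {
        this.lowest = lowest;
        this.highest = highest;
    }

    boolean accepts(int docId) {
        return docId >= lowest && docId <= highest;
    }
}

final class RangePrecomputer {
    /**
     * firstDocByTerm: sorted date term -> first doc ID with that term
     *                 (assumption: simulates what TermEnum + TermDocs would give).
     * boundaries:     sorted ascending; range i covers [boundaries[i], boundaries[i+1]).
     * maxDoc:         one more than the highest doc ID in the index.
     */
    static List<DocIdRange> precompute(TreeMap<String, Integer> firstDocByTerm,
                                       String[] boundaries, int maxDoc) {
        // Boundaries with no term at or after them fall off the end of the index.
        int[] firstDocAtOrAfter = new int[boundaries.length];
        Arrays.fill(firstDocAtOrAfter, maxDoc);

        // One forward pass over the sorted terms resolves every boundary.
        int b = 0;
        for (Map.Entry<String, Integer> e : firstDocByTerm.entrySet()) {
            while (b < boundaries.length && boundaries[b].compareTo(e.getKey()) <= 0) {
                firstDocAtOrAfter[b++] = e.getValue();
            }
            if (b == boundaries.length) {
                break;
            }
        }

        // Each range ends one doc before the next range begins.
        List<DocIdRange> ranges = new ArrayList<>();
        for (int i = 0; i + 1 < boundaries.length; i++) {
            ranges.add(new DocIdRange(firstDocAtOrAfter[i], firstDocAtOrAfter[i + 1] - 1));
        }
        return ranges;
    }
}
```

With boundaries at the first of each month, each resulting DocIdRange
answers "is this doc in that month?" with two int comparisons. In a real
index the same loop would walk the terms of the date field and ask a
TermDocs for the first doc of each term that crosses a boundary.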

I've never really thought about an approach like this (because I never
deal with indexes that only grow without deletes, and I *never* deal with
indexes where I know what order things are added in) but it seems like it
should work pretty well for ranges you know in advance are going to be
important ... those Filters should be pretty damn space efficient too --
they only have to keep track of two ints, not big BitSets lying around.

