lucene-java-user mailing list archives

From "Mike Streeton" <mike.stree...@ardentia.co.uk>
Subject RE: Date ranges - getting the approach right
Date Thu, 20 Jul 2006 15:13:51 GMT
This is how we solve the range query problem using filters. The nice
part is that the range becomes a clause of the query itself, so several
ranges can be ORed, ANDed, or NOTed together if required, instead of
applying one range filter to the whole query. (Assumes dates are in
YYYYMMDD format.)

Hope this helps.
Mike

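Extend QueryParser:
	// Turn each parsed range into a RangeFilter wrapped in a query, so it
	// can be combined with other clauses like any other Query.
	@Override
	protected Query getRangeQuery(String field, String lower, String upper,
			boolean inclusive) throws ParseException {
		return new FilteredQuery(new MatchAllDocsQuery(),
				new RangeFilter(field, lower, upper, inclusive, inclusive));
	}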
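
To illustrate why having each range as its own clause is useful, here is a minimal sketch in plain Java with no Lucene dependency. It assumes each range ends up as a set of matching document IDs, modelled here with java.util.BitSet. The class RangeBitsDemo, the method rangeBits, and the sample dates are hypothetical names and data made up for this illustration; they are not part of Lucene.

```java
import java.util.BitSet;

public class RangeBitsDemo {
    /** Marks every document whose YYYYMMDD date lies in [lower, upper], inclusive. */
    public static BitSet rangeBits(String[] docDates, String lower, String upper) {
        BitSet bits = new BitSet(docDates.length);
        for (int doc = 0; doc < docDates.length; doc++) {
            // Fixed-width YYYYMMDD strings compare lexicographically in date order.
            if (docDates[doc].compareTo(lower) >= 0
                    && docDates[doc].compareTo(upper) <= 0) {
                bits.set(doc);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        // Hypothetical index: the array position is the document ID.
        String[] docDates = {"20060105", "20060214", "20060320", "20060611", "20060719"};

        BitSet firstQuarter = rangeBits(docDates, "20060101", "20060331");
        BitSet june = rangeBits(docDates, "20060601", "20060630");
        BitSet february = rangeBits(docDates, "20060201", "20060228");

        // firstQuarter OR june
        BitSet either = (BitSet) firstQuarter.clone();
        either.or(june);

        // firstQuarter AND NOT february
        BitSet quarterNotFeb = (BitSet) firstQuarter.clone();
        quarterNotFeb.andNot(february);

        System.out.println(either);        // prints {0, 1, 2, 3}
        System.out.println(quarterNotFeb); // prints {0, 2}
    }
}
```

With the QueryParser override above, the same combinations would be written directly in the query string, one range clause per date range.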

www.ardentia.com the home of NetSearch

-----Original Message-----
From: Chris Hostetter [mailto:hossman_lucene@fucit.org] 
Sent: 16 July 2006 20:46
To: java-user@lucene.apache.org
Subject: RE: Date ranges - getting the approach right

: The second approach requires three hits, doesn't it?
:
: (1) TermQuery on start date + sort on document ID
: (2) TermQuery on end date + reverse sort on document ID
: (3) The actual query with a filter on the above

You wouldn't need 3 queries for each date range you wanted to
precompute. You could use one pass over the TermEnum to seek forward to
the first term after the start of each range, and then seek a TermDocs
to that term to find the first doc in that range. Then you can have a
Filter that just flat out knows the lowest and highest IDs for each
range.

Basically you are precomputing all of your ranges in one low level
TermEnum scan instead of a separate query/filter pass (which internally
does a TermEnum scan) for each range.

I've never really thought about an approach like this (because i never
deal with indexes that only grow without deletes, and i *never* deal
with indexes where i know what order things are added in) but it seems
like it should work pretty well for ranges you know in advance are going
to be important ... those Filters should be pretty damn space efficient
too -- they only have to keep track of two ints, not big BitSets lying
around.
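
A minimal sketch of the "two ints per range" idea, in plain Java with no Lucene dependency. It assumes an append-only index whose documents were added in date order, so document IDs rise with the date. A binary search over an in-memory array of per-document dates stands in for the TermEnum/TermDocs scan described above. DocIdRangeDemo, DocIdRange, precompute, and lowerBound are hypothetical names for this illustration, not Lucene API.

```java
public class DocIdRangeDemo {
    /** A precomputed range that stores only its lowest and highest doc ID. */
    public static final class DocIdRange {
        public final int first; // inclusive
        public final int last;  // inclusive; last < first means no docs matched

        DocIdRange(int first, int last) {
            this.first = first;
            this.last = last;
        }

        public boolean accepts(int docId) {
            return docId >= first && docId <= last;
        }
    }

    /**
     * docDates[i] is the YYYYMMDD date of document i as an int. The array
     * must be non-decreasing, which is the "added in date order" assumption.
     */
    public static DocIdRange precompute(int[] docDates, int start, int end) {
        int first = lowerBound(docDates, start);      // first doc with date >= start
        int last = lowerBound(docDates, end + 1) - 1; // last doc with date <= end
        return new DocIdRange(first, last);
    }

    /** Index of the first element >= key, or sorted.length if there is none. */
    static int lowerBound(int[] sorted, int key) {
        int lo = 0;
        int hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] docDates = {20060105, 20060105, 20060214, 20060320, 20060611};
        DocIdRange febToMarch = precompute(docDates, 20060201, 20060331);
        System.out.println(febToMarch.first + ".." + febToMarch.last); // prints 2..3
        System.out.println(febToMarch.accepts(4));                     // prints false
    }
}
```

If documents are ever deleted or added out of date order, the ID bounds stop being trustworthy, which is the caveat already raised above.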





-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

