lucene-dev mailing list archives

From John Mercouris <jmercou...@gmail.com>
Subject Re: Date Range Query Feature Implementation
Date Mon, 30 Apr 2012 08:35:54 GMT
Hello Uwe, you bring up some very valid points. We did not use those classes because we did
not know how to use them. Given what you say, it may make sense to rewrite our implementation
as a user-level library: a package of utilities that can be used together to implement date
range searching easily, such as a date range parser, a date range index, and so on.

Also, while using ConjunctionScorer is indeed much faster, bear in mind that our algorithm
only intersects against the set of results returned by IndexSearcher, which means that we
are perhaps intersecting a smaller subset.

The methodology you describe queries the entire index for the date range, then queries the
entire index for the term, and then intersects those results, which may be slower.
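
To make the intersection being discussed concrete, here is a minimal sketch of intersecting
two sorted doc ID lists where each side jumps ahead to the other side's current candidate
instead of visiting every entry. This is not Lucene's ConjunctionScorer code; the class and
method names (LeapfrogIntersection, advance, intersect) and the sample doc IDs are
hypothetical and only illustrate the general idea of a non-naive intersection.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: not Lucene code. All names here are hypothetical. */
final class LeapfrogIntersection {

    /** Binary search for the index of the first element >= target, starting at 'from'. */
    static int advance(int[] sorted, int from, int target) {
        int lo = from, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /** Intersects two ascending doc ID arrays; the lagging side skips to the other's doc ID. */
    static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                out.add(a[i]);
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i = advance(a, i, b[j]); // skip everything in a below b's candidate
            } else {
                j = advance(b, j, a[i]); // skip everything in b below a's candidate
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] dateRangeDocs = {2, 3, 5, 8, 13, 21, 34, 55}; // docs matching the date range
        int[] termDocs = {5, 21, 22, 55, 89};               // docs matching the term
        System.out.println(intersect(dateRangeDocs, termDocs)); // prints [5, 21, 55]
    }
}
```

In this sketch neither list is walked entry by entry; how much is skipped depends on how
sparse the matches are on each side.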

I imagine most users don't look beyond the top twenty documents anyway, so there is no reason
to query the entire index for every document that fits the date range. A "lazy" (term used
loosely) loading approach may be best because, when you really break it down, a date range
is more like a filter on a set of results and less like something that you have to query
against the entire database.
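
As a rough illustration of the "lazy" filtering idea, the sketch below walks hits in rank
order, keeps only those whose date falls inside the range, and stops as soon as it has
enough. It is not our implementation and not a Lucene API: the names (LazyDateFilter, Hit,
topKInRange, toMillis) are hypothetical, and dates are assumed to be stored as milliseconds
since 1970 in UTC.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: not Lucene code. All names here are hypothetical. */
final class LazyDateFilter {

    /** A ranked hit: a doc ID plus its date as milliseconds since 1970 (UTC). */
    static final class Hit {
        final int docId;
        final long epochMillis;

        Hit(int docId, long epochMillis) {
            this.docId = docId;
            this.epochMillis = epochMillis;
        }
    }

    /** Converts a calendar date to milliseconds since 1970 at the start of that day in UTC. */
    static long toMillis(int year, int month, int day) {
        return LocalDate.of(year, month, day)
                .atStartOfDay(ZoneOffset.UTC)
                .toInstant()
                .toEpochMilli();
    }

    /** Walks hits in rank order, keeps those within [from, to], and stops once k are found. */
    static List<Integer> topKInRange(Iterable<Hit> ranked, long from, long to, int k) {
        List<Integer> out = new ArrayList<>();
        if (k <= 0) {
            return out;
        }
        for (Hit h : ranked) {
            if (h.epochMillis >= from && h.epochMillis <= to) {
                out.add(h.docId);
                if (out.size() == k) {
                    break; // enough results: the rest of the ranked list is never examined
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Hit> ranked = Arrays.asList(
                new Hit(7, toMillis(2012, 4, 30)),
                new Hit(3, toMillis(2011, 1, 15)),
                new Hit(9, toMillis(2012, 2, 1)),
                new Hit(4, toMillis(2012, 3, 10)));
        long from = toMillis(2012, 1, 1);
        long to = toMillis(2012, 12, 31);
        System.out.println(topKInRange(ranked, from, to, 2)); // prints [7, 9]
    }
}
```

The trade-off in a sketch like this is that when few hits fall inside the range, the walk
may have to go deep into the ranked list before it finds k matches.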

Given these points, perhaps a combination of the two ideas would make the best
implementation. I will continue to think about it. Thank you very much for your input.

Again, these are just some ideas I am throwing around. I obviously can't speak in absolute
terms because I do not know Lucene very well, but these are my thoughts so far. Any and all
feedback is appreciated, and once again,

Thank you for your input,

-John

On Apr 30, 2012, at 3:03 AM, Uwe Schindler wrote:

> Hi,
> 
> Thanks for your input. One citation from your report:
> 
> "These types of searches are uncommon, and thus programmers don't optimize
> for this case. Lucene, for example, has the ability to filter search results
> using date-ranges, but it is a slow, naive algorithm implemented through
> lexographic range searching on a custom field. Which is a user level hack
> that works ineffectively. There are no known other ways of performing a date
> range search."
> 
> Since Lucene 2.9 / Solr 1.4, Lucene can handle numerical ranges without "a
> slow, naive algorithm"; see NumericRangeQuery and NumericField. As every
> date can be represented as a number (e.g. year as integer, or milliseconds
> since 1970 as long, ...), date searches can be done easily with Lucene (and
> very fast, because the intersection between the NumericRangeQuery and the
> TermQuery is done using ConjunctionScorer, which does *not* naively iterate
> the postings).
> 
> Did you consider this in your implementation?
> 
> Uwe
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> 
>> -----Original Message-----
>> From: John Mercouris [mailto:jmercouris@gmail.com]
>> Sent: Monday, April 30, 2012 9:24 AM
>> To: dev@lucene.apache.org
>> Subject: Date Range Query Feature Implementation
>> 
>> Hello we (John Mercouris & Nick Zivkovic) have implemented date range
>> search functionality into Lucene as part of a class project. The implementation
>> is detailed in the PDF attached. The source is available for download from
>> github at the URL: git://github.com/cs429-ir/date-range-search.git
>> 
>> We hope that you find this useful,
>> 
>> -John & Nick
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
> 