lucene-solr-user mailing list archives

From "Jonathan Ariel" <ionat...@gmail.com>
Subject Re: Date range performance
Date Fri, 04 Apr 2008 19:38:32 GMT
Thanks! I'll try reducing the precision and let you know the result.
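
To illustrate why reducing precision helps, here is a minimal sketch that counts how many distinct terms the same set of timestamps produces at different precisions. The data set (one document every 7 seconds for 30 days) and the helper names `index_term` and `unique_terms` are made up for illustration; this is not Solr's date handling, only the counting argument from the thread.

```python
from datetime import datetime, timedelta

# Assumed term formats: sortable strings truncated to a given precision.
FORMATS = {
    "second": "%Y-%m-%dT%H:%M:%SZ",
    "hour": "%Y-%m-%dT%H:00:00Z",
    "day": "%Y-%m-%dT00:00:00Z",
}

def index_term(ts: datetime, precision: str) -> str:
    """Format a timestamp as a string term truncated to `precision`."""
    return ts.strftime(FORMATS[precision])

# Hypothetical data: one document every 7 seconds over 30 days.
start = datetime(2008, 3, 1)
timestamps = [start + timedelta(seconds=7 * i)
              for i in range(30 * 24 * 3600 // 7)]

def unique_terms(precision: str) -> int:
    """Number of distinct terms a range over all the data would cover."""
    return len({index_term(ts, precision) for ts in timestamps})

for precision in ("second", "hour", "day"):
    print(precision, unique_terms(precision))
```

With second precision every document contributes its own term, while day precision collapses the same documents into 30 terms, so a range over the whole month has far fewer terms to check.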

Looking at the code, it seems to be a Lucene problem more than a Solr one: it is
in the RangeQuery and RangeFilter classes. The problem with changing this to
use a sorted index and then binary search is that you have to sort it,
which is slow. Unless we can store the ordered index somewhere and reuse it,
it will be even slower than it is now. And if we store it, we will have to deal
with updating the ordered index when new terms are added.
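
The trade-off can be sketched with a toy in-memory term list. This is only an illustration of the idea under discussion, not how Lucene stores or searches its terms; `terms_in_range` and `add_term` are hypothetical helpers.

```python
import bisect

def terms_in_range(sorted_terms, lower, upper):
    """Return the terms in [lower, upper] using two binary searches.

    Finding the two bounds is O(log n), but the slice still contains
    every term inside the range.
    """
    lo = bisect.bisect_left(sorted_terms, lower)
    hi = bisect.bisect_right(sorted_terms, upper)
    return sorted_terms[lo:hi]

def add_term(sorted_terms, term):
    """Insert a new term in place, keeping the list sorted and unique.

    The position is found by binary search, but the list insert shifts
    elements, so each update is O(n) in this toy structure.
    """
    i = bisect.bisect_left(sorted_terms, term)
    if i == len(sorted_terms) or sorted_terms[i] != term:
        sorted_terms.insert(i, term)

# Sorting once up front is the expensive step that has to be reused.
terms = sorted(["2008-04-01", "2008-04-03", "2008-04-02", "2008-04-05"])
print(terms_in_range(terms, "2008-04-02", "2008-04-04"))
add_term(terms, "2008-04-04")
print(terms)
```

Keeping the list sorted on every insert avoids re-sorting, at the cost of more expensive updates.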


On Fri, Apr 4, 2008 at 3:30 PM, Mike Klaas <mike.klaas@gmail.com> wrote:

> On 3-Apr-08, at 4:24 PM, Jonathan Ariel wrote:
>
> > Does this depend on the number of documents that match the query or on the
> > number of documents in the index?
> >
>
> This aspect is more dependent on the number of terms that the date query
> translates into.
>
> > If my query matches 4 documents in a 3-million-document index, could having
> > dates with a precision of seconds slow down the query?
> >
>
> Yes.  Solr does range queries by taking the disjunction of a bunch of term
> queries, so it is the total number of terms checked that is the limiting
> factor.
>
> It would be better to implement this using an ordered index that could be
> binary-searched, but Solr isn't currently designed for that (though I think
> range optimization algorithms would be a cool addition).
>
> -Mike
>
>
>
> > On Thu, Apr 3, 2008 at 7:45 PM, Mike Klaas <mike.klaas@gmail.com> wrote:
> >
> >
> > > On 3-Apr-08, at 2:14 PM, Jonathan Ariel wrote:
> > >
> > > > Hi,
> > > > I'm experiencing really poor performance when using date ranges in a
> > > > Solr query. Is it a known issue? Is there any special consideration when
> > > > using date ranges? It seems weird, because I always thought dates are
> > > > translated to strings, so internally Lucene resolves everything the same
> > > > way. So maybe the problem is with parsing the dates and translating them
> > > > to the internal value?
> > > > Any suggestions?
> > > >
> > > >
> > > Range query is highly dependent on the total number of unique terms
> > > covered by the range.  If you are indexing dates with very high
> > > precision
> > > (e.g., milliseconds), this can consist of ridiculous numbers of terms.
> > >
> > > > Try rounding the dates to something coarser when indexing.
> > >
> > > -Mike
> > >
> > >
>
