lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Search Performance Problem 16 sec for 250K docs
Date Sun, 20 Aug 2006 13:16:21 GMT
About Luke... I don't know about command-line interfaces, but you can copy
your index to a different machine and use Luke there. I do this between
Linux and Windows boxes all the time. Or, if you can mount the remote drive
so you can see it, you can just use Luke to browse to it and open it up. You
may have some latency though...

See below...

On 8/20/06, M A <geneticflyer@googlemail.com> wrote:
>
> Ok I get your point, this still however means the first search on the new
> searcher will take a huge amount of time .. given that this is happening
> now ..


You can fire one or several canned queries at the searcher whenever you open
a new one. That way the first time a *user* hits the box, the warm-up will
already have happened. Note that the same searcher can be used by multiple
threads...
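
To make the warm-up idea concrete, here is a minimal sketch of "open, warm, then publish". It deliberately avoids the Lucene API: Searcher is a hypothetical stand-in for IndexSearcher, and WarmSearcherHolder, refresh and the canned query strings are names I made up for illustration. In a real application the canned queries would use the same sort as your production searches, so the per-searcher sort cache gets filled before any user sees the new searcher.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Sketch only: none of these names are Lucene API. */
final class WarmSearcherHolder {

    /** Hypothetical stand-in for an opened IndexSearcher. */
    interface Searcher {
        int search(String query); // returns a hit count, for illustration only
    }

    // The searcher currently shared by all request threads.
    private final AtomicReference<Searcher> current = new AtomicReference<>();

    /** Open a new searcher, run the canned queries on it, and only then publish it. */
    void refresh(Supplier<Searcher> opener, List<String> cannedQueries) {
        Searcher fresh = opener.get();
        for (String q : cannedQueries) {
            fresh.search(q); // pays the one-time first-search cost up front
        }
        current.set(fresh); // from here on, users only see the warmed searcher
    }

    /** Request threads call this; they never open a searcher themselves. */
    Searcher get() {
        return current.get();
    }
}
```

You would call refresh(...) from whatever job updates the index every 5 minutes, and have the search requests call get(). Closing the old searcher once no request is still using it is left out of the sketch.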


> i.e. new search -> new query -> get hits -> 20+ secs .. this happens every 5
> mins or so ..
>
> although subsequent searches may be quicker ..
>
> Am i to assume for a first search the amount of time is ok -> .. seems
> like a long time to me ..?
>
> The other thing is the sorting is fixed .. it never changes .. it is
> always sorted by the same field ..


Assuming that you still have performance issues, you could think about
building your index in pre-sorted order and just avoiding the sorting
altogether. The internal Lucene document IDs are then your sort order (a newly
added doc has an ID that is always greater than any existing doc ID). I
don't know the details of your problem space, but this might be relatively
easy... You won't want to return things in relevance order in that case. In
fact, you probably don't want relevance scoring at all since your sorting
doesn't change... I think a ConstantScoreQuery might work for you here.

But I wouldn't go there unless you have evidence that your sort is slowing
you down, which is easy enough to verify by just taking it out. Don't bother
with any of this until you re-use your reader though....
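
For what it's worth, the pre-sorted idea can be sketched with a toy in-memory "index" where the doc ID is just the position in a list. This is not Lucene code: PresortedIndexSketch, Story, buildIndex and search are hypothetical names, and sid is assumed to be the numeric sort field from the quoted search call. The point is only that if docs are added in sort order, walking the matches in doc ID order already gives sorted results.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Sketch only: a toy index where a doc ID is the position in a list. */
final class PresortedIndexSketch {

    /** Hypothetical story record; sid is assumed to be the fixed sort field. */
    static final class Story {
        final int sid;
        final String text;

        Story(int sid, String text) {
            this.sid = sid;
            this.text = text;
        }
    }

    /** Add stories in ascending sid order so that doc ID order equals sid order. */
    static List<Story> buildIndex(List<Story> stories) {
        List<Story> index = new ArrayList<>(stories);
        index.sort(Comparator.comparingInt((Story s) -> s.sid));
        return index;
    }

    /** Collect matching doc IDs by scanning in doc ID order; no sort step is needed. */
    static List<Integer> search(List<Story> index, String term) {
        List<Integer> docIds = new ArrayList<>();
        for (int docId = 0; docId < index.size(); docId++) {
            if (index.get(docId).text.contains(term)) {
                docIds.add(docId);
            }
        }
        return docIds;
    }
}
```

If you need the reverse order (as I read the true flag in new Sort("sid", true)), you would walk the matching IDs from the other end instead of indexing in a different order.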

> i just built the entire index and it still takes ages ..


The search took ages? Or building the index? If the former, then rebuilding
the index is irrelevant; it's the first time you use a searcher that counts.

On 8/20/06, Chris Hostetter <hossman_lucene@fucit.org> wrote:
> >
> >
> > : This is because the index is updated every 5 mins or so, due to the
> > : incoming feed of stories ..
> > :
> > : When you say iteration, i take it you mean, search request, well for
> > : each search that is conducted I create a new one .. search reader that
> > : is ..
> >
> > yeah ... i meant iteration of your test.  don't do that.
> >
> > if the index is updated every 5 minutes, then open a new searcher every
> > 5 minutes -- and reuse it for the entire 5 minutes.  if it's updated
> > "sporadically throughout the day" then open a searcher, and keep using it
> > until the index is updated, then open a new one.
> >
> > reusing an indexsearcher as long as possible is one of the biggest
> > performance factors in Lucene applications.
> >
> > :
> > :
> > :
> > : On 8/19/06, Chris Hostetter <hossman_lucene@fucit.org> wrote:
> > : >
> > : >
> > : > :     hits = searcher.search(query, new Sort("sid", true));
> > : >
> > : > you don't show where searcher is initialized, and you don't clarify
> > : > how you are timing your multiple iterations -- i'm going to guess
> > : > that you are opening a new searcher every iteration right?
> > : >
> > : > sorting on a field requires pre-computing an array of information
> > : > for that field -- this is both time and space expensive, and is
> > : > cached per IndexReader/IndexSearcher -- so if you reuse the same
> > : > searcher and time multiple iterations you'll find that the first
> > : > iteration might be somewhat slow, but the rest should be very fast.
> > : >
> > : >
> > : >
> > : > -Hoss
> > : >
> > : >
> > : >
> ---------------------------------------------------------------------
> > : > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > : > For additional commands, e-mail: java-user-help@lucene.apache.org
> > : >
> > : >
> > :
> >
> >
> >
> > -Hoss
> >
> >
> >
> >
>
>
