lucene-dev mailing list archives

From: "yueyu lin"
Subject: Re: Multiple threads searching in Lucene and the synchronized issue. -- solution attached.
Date: Tue, 09 May 2006 07:53:55 GMT

Please trace the code into Lucene when searching.
Here is a table of how the calls are made.

The trace log:

*Steps*

1.  public final Hits search(Query query)
    It calls another search function.

2.  public Hits search(Query query, Filter filter)
    Only one line of code. It creates a new Hits:
    return new Hits(this, query, filter);

3.  Hits(Searcher s, Query q, Filter f)
    Next, we trace into the constructor to see what is done.

4.  Hits(Searcher s, Query q, Filter f)
    line 41: weight = q.weight(s)
    This call rewrites the Query if necessary. Let us see what happens
    then.

5.  public Weight weight(Searcher
    line 92: Query query = searcher.rewrite(this);
    This call begins rewriting the Query.

6.  public Query rewrite(Query original)
    NOTE: we have only one IndexSearcher, which has one IndexReader. If
    any function on this path is synchronized, the queries will be
    queued.

7.  public Query rewrite(IndexReader
    line 396: Query query = c.getQuery().rewrite(reader);
    Here, BooleanQuery gets its subqueries and calls their rewrite
    function. The function requires a parameter, the *IndexReader*, of
    which we have only one instance. From the code we can see that
    *TermQuery* is not rewritten and *PrefixQuery* is rewritten into
    several *TermQuery*s. So we ignore *TermQuery* and look into
    *PrefixQuery*.

8.  public Query rewrite(IndexReader
    line 41: TermEnum enumerator = reader.terms(prefix);
    Let's see what happens then.

9.  org.apache.lucene.index.SegmentReader
    public TermEnum terms(Term t)
    line 277: return tis.terms(t);
    SegmentReader is in fact an implementation of IndexReader.

10. org.apache.lucene.index.TermInfosReader
    public SegmentTermEnum terms(Term term)
    line 211: get(term);

11. org.apache.lucene.index.TermInfosReader
    TermInfo get(Term term)
    line 136: ensureIndexIsRead();
    We finally found it!

12. org.apache.lucene.index.TermInfosReader
    private synchronized void ensureIndexIsRead()
    Let's analyze the function to see why it is synchronized and how to
    improve it.
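
To make the last step concrete, here is a minimal sketch of the pattern in question. It assumes ensureIndexIsRead() lazily loads the term index on first use, which is inferred from the method name and not copied from the Lucene source. The class LazyIndexSketch, its field and its load() method are hypothetical stand-ins, and this is not the attached patch. The first method takes the instance lock on every call; the second locks only while the index is still missing.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a lazily loaded term index; not Lucene code.
class LazyIndexSketch {
    // volatile so a fully built array is safely published to other threads
    private volatile String[] indexTerms;
    // counts how many times the expensive load actually ran
    final AtomicInteger loads = new AtomicInteger();

    // Shape seen in the trace: every caller takes the instance lock,
    // even long after the index has been loaded.
    synchronized void ensureIndexIsReadLocked() {
        if (indexTerms == null) {
            indexTerms = load();
        }
    }

    // Alternative shape: check first, lock only if the index is missing.
    void ensureIndexIsReadChecked() {
        if (indexTerms != null) {
            return; // fast path, no lock taken
        }
        synchronized (this) {
            if (indexTerms == null) { // re-check under the lock
                indexTerms = load();
            }
        }
    }

    // Placeholder for the expensive read of the index from disk.
    private String[] load() {
        loads.incrementAndGet();
        return new String[] {"apple", "banana"};
    }

    public static void main(String[] args) throws InterruptedException {
        LazyIndexSketch index = new LazyIndexSketch();
        Thread[] searchers = new Thread[4];
        for (int i = 0; i < searchers.length; i++) {
            searchers[i] = new Thread(index::ensureIndexIsReadChecked);
            searchers[i].start();
        }
        for (Thread t : searchers) {
            t.join();
        }
        System.out.println("loads = " + index.loads.get()); // prints "loads = 1"
    }
}
```

The check-then-lock form is only safe because the field is volatile, and that guarantee depends on the Java 5 (JSR-133) memory model; on older JVMs this idiom is not reliable.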

On 5/9/06, Chris Hostetter wrote:
> :   We found if we were using 2 IndexSearcher, we would get 10%
> : performance benefit.
> :   But if we increased the number of IndexSearcher from 2, the
> : performance improvement became slight, even worse.
>
> Why use more than 2 IndexSearchers?
>
> Typically 1 is all you need, except for when you want to open and "warm
> up" a new Searcher because you know your index has changed on disk and
> you're ready for those changes to be visible.
>
> (I'm not arguing against your change -- concurrency isn't my forte so I
> have no opinion on whether your suggestion is good or not, I'm just
> questioning the goal)
>
> Actually .. I don't know a lot about the internals of IndexSearcher and
> TermInfosReader, but according to your description of the problem...
>
> :   The class org.apache.lucene.index.TermInfosReader , as you know, every
> : IndexSearcher will have one TermInfosReader. Every query, one method in
> : the class must be called:
> : private synchronized void ensureIndexIsRead() throws IOException .
> : Notice
>
> If the method isn't static, then how can two different instances of
> IndexSearcher, each with their own TermInfosReader, block one another?
>
> -Hoss
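
The locking rule behind this question can be shown with a small sketch. A non-static synchronized method holds the monitor of the instance it is called on, so two separate instances never contend for the same lock, while threads sharing one instance do. The class MonitorSketch below is hypothetical and is not part of Lucene; it only uses Thread.holdsLock from the JDK.

```java
// Hypothetical class; it only shows which monitor a synchronized
// instance method holds while it runs.
class MonitorSketch {
    // Called with the monitor of 'this' held; reports whether the current
    // thread also holds the monitor of 'other'.
    synchronized boolean holdsLockOf(Object other) {
        return Thread.holdsLock(other);
    }

    public static void main(String[] args) {
        MonitorSketch a = new MonitorSketch();
        MonitorSketch b = new MonitorSketch();
        System.out.println(a.holdsLockOf(a)); // prints "true"
        System.out.println(a.holdsLockOf(b)); // prints "false"
    }
}
```

So a synchronized instance method serializes only the threads that share the same object, which is the single-IndexSearcher case described in the trace above.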

Yueyu Lin
