lucene-dev mailing list archives

From Otis Gospodnetic <>
Subject Re: Multiple threads searching in Lucene and the synchronized issue. -- solution attached.
Date Tue, 09 May 2006 14:46:31 GMT
Yueyu Lin,

From what I can tell from a quick look at the method, it needs to remain synchronized
so that multiple threads don't accidentally re-read 'indexTerms' (of type Term[]).  Even though
the method is synchronized, it looks like only the first invocation enters the try/catch/finally
block where the terms are read.  Subsequent calls to this method should exit quickly, because
indexTerms != null.
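
To illustrate the pattern described above, here is a minimal sketch of a synchronized lazy initializer. It is not the Lucene source: the class name LazyIndexSketch, the String[] stand-in for Term[], and the 'loads' counter are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the lazy-loading part of TermInfosReader.
final class LazyIndexSketch {
    private String[] indexTerms;                      // stand-in for Lucene's Term[]
    final AtomicInteger loads = new AtomicInteger();  // counts how often the slow path runs

    // Synchronized so two threads cannot both see null and both load the terms.
    synchronized void ensureIndexIsRead() {
        if (indexTerms != null) {
            return;                                   // every later call exits here
        }
        loads.incrementAndGet();
        // Placeholder for the expensive read of the term index.
        indexTerms = new String[] {"apple", "banana", "cherry"};
    }

    public static void main(String[] args) throws InterruptedException {
        LazyIndexSketch reader = new LazyIndexSketch();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(reader::ensureIndexIsRead);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("slow path ran " + reader.loads.get() + " time(s)");
    }
}
```

Every call still has to acquire the monitor, but once indexTerms is set the method body is only a null check, so the lock is held very briefly.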

Are you sure this is causing the bottleneck for you?
I believe the proper way to find out is to send the JVM a signal that makes it dump thread
information (on Unix, SIGQUIT, e.g. kill -QUIT <pid>).  The dump would tell you where the
code is blocking.
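
If sending a signal is inconvenient, similar information can be collected from inside the JVM. The sketch below uses the standard java.lang.management API (ThreadMXBean.dumpAllThreads, available since Java 6) rather than a signal; the class name BlockedThreadReport and the report format are assumptions made for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the threads that are waiting to enter a monitor.
final class BlockedThreadReport {

    /** Returns one line per thread whose state is BLOCKED at the time of the call. */
    static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        // false, false: do not collect locked-monitor or locked-synchronizer details.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                StackTraceElement[] stack = info.getStackTrace();
                String top = stack.length > 0 ? stack[0].toString() : "(no frames)";
                report.add(info.getThreadName()
                        + " blocked on " + info.getLockName()
                        + " held by " + info.getLockOwnerName()
                        + " at " + top);
            }
        }
        return report;
    }

    public static void main(String[] args) {
        for (String line : blockedThreads()) {
            System.out.println(line);
        }
    }
}
```

Calling blockedThreads() a few times while the search load is running shows whether threads really pile up on one monitor, and which method they are trying to enter.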

Also, if you have concrete suggestions for code changes, please post them to JIRA as diffs/patches.


----- Original Message ----
From: yueyu lin <>
Sent: Tuesday, May 9, 2006 3:53:55 AM
Subject: Re: Multiple threads searching in Lucene and the synchronized issue. -- solution

Please trace the code into Lucene while a search runs.
Here is how the calls are made, step by step.

The trace log:

  1.  public final Hits search(Query query)
      It calls another search function.

  2.  public Hits search(Query query, Filter filter)
      Only one line of code. It creates a new Hits:
      return new Hits(this, query, filter);

  3.  Hits(Searcher s, Query q, Filter f)
      Next, we trace into the constructor to see what is done there.

  4.  Hits(Searcher s, Query q, Filter f)
      line 41: weight = q.weight(s)
      This call rewrites the Query if necessary. Let us see what happens next.

  5.  public Weight weight(Searcher
      line 92: Query query = searcher.rewrite(this);
      This call begins rewriting the Query.

  6.  public Query rewrite(Query original)
      NOTE: we only have one IndexSearcher, which has one IndexReader. If any
      functions on this path are synchronized, the queries will be queued.

  7.  public Query rewrite(IndexReader
      line 396: Query query = c.getQuery().rewrite(reader);
      Here, BooleanQuery gets its subqueries and calls their rewrite function.
      The function requires an *IndexReader* parameter, of which we have only
      one instance. From the code we can see that *TermQuery* is not rewritten,
      while *PrefixQuery* is rewritten into several *TermQuery*s. So we ignore
      *TermQuery* and look into *PrefixQuery*.

  8.  public Query rewrite(IndexReader
      line 41: TermEnum enumerator = reader.terms(prefix);
      Let's see what happens next.

  9.  org.apache.lucene.index.SegmentReader
      public TermEnum terms(Term t)
      line 277: return tis.terms(t);
      SegmentReader is in fact an implementation of IndexReader.

  10. org.apache.lucene.index.TermInfosReader
      public SegmentTermEnum terms(Term term)
      line 211: get(term);

  11. org.apache.lucene.index.TermInfosReader
      TermInfo get(Term term)
      line 136: ensureIndexIsRead();
      We have finally found it!

  12. org.apache.lucene.index.TermInfosReader
      private synchronized void ensureIndexIsRead()
      Let's analyze the function to see why it is synchronized and how to
      improve it.
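
One common way to take the monitor off the hot path of a lazy initializer like the one in step 12 is double-checked locking on a volatile field. The sketch below shows that general technique only. It is not the patch discussed in this thread and not the Lucene source; the class name, the String[] stand-in for Term[], and the helper method are assumptions made for the example. It also relies on the Java 5 memory model for the volatile read to be safe.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variant of the lazy loader that skips the lock once the index is loaded.
final class DoubleCheckedIndexSketch {
    // volatile: a thread that sees a non-null reference also sees the array's contents.
    private volatile String[] indexTerms;
    final AtomicInteger loads = new AtomicInteger();  // counts how often the slow path runs

    void ensureIndexIsRead() {
        if (indexTerms != null) {
            return;                                   // lock-free fast path
        }
        synchronized (this) {
            if (indexTerms == null) {                 // re-check while holding the lock
                loads.incrementAndGet();
                // Placeholder for the expensive read of the term index.
                indexTerms = new String[] {"apple", "banana", "cherry"};
            }
        }
    }

    /** Calls ensureIndexIsRead() from several threads and returns the number of loads. */
    static int loadsAfterConcurrentCalls(int threadCount) {
        DoubleCheckedIndexSketch reader = new DoubleCheckedIndexSketch();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(reader::ensureIndexIsRead);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // preserve the interrupt and stop waiting
                break;
            }
        }
        return reader.loads.get();
    }

    public static void main(String[] args) {
        System.out.println("loads: " + loadsAfterConcurrentCalls(8));
    }
}
```

Whether this buys anything in practice depends on how contended the monitor really is, which is worth measuring before changing the code.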

On 5/9/06, Chris Hostetter <> wrote:
> :   We found if we were using 2 IndexSearchers, we would get 10% performance
> : benefit.
> :   But if we increased the number of IndexSearchers from 2, the performance
> : improvement became slight or even worse.
> Why use more than 2 IndexSearchers?
> Typically 1 is all you need, except for when you want to open and "warm
> up" a new Searcher because you know your index has changed on disk and
> you're ready for those changes to be visible.
> (I'm not arguing against your change -- concurrency isn't my forte so I
> have no opinion on whether your suggestion is good or not, I'm just
> questioning the goal)
> Actually ... I don't know a lot about the internals of IndexSearcher and
> TermInfosReader, but according to your description of the problem...
> :   The class org.apache.lucene.index.TermInfosReader , as you know, every
> : IndexSearcher will have one TermInfosReader. Every query, one method in the
> : class must be called:
> : private synchronized void ensureIndexIsRead() throws IOException . Notice
> If the method isn't static, then how can two different instances of
> IndexSearcher, each with their own TermInfosReader, block one another?
> -Hoss
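
The point in the question above is that a non-static synchronized method locks on the instance it is called on, so two separate instances have two separate monitors. A minimal sketch of that behaviour follows; the class name PerInstanceLockSketch and its methods are assumptions made for the example and have nothing to do with Lucene's classes.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo: an instance-level synchronized method only contends per instance.
final class PerInstanceLockSketch {

    // Locks on 'this', not on the class.
    synchronized void touch() {
        // Nothing to do; acquiring the monitor is the point.
    }

    /**
     * Holds the monitor of 'held' in one thread, calls target.touch() in another,
     * and returns true if that call finished within timeoutMillis.
     */
    static boolean completesWhileHeld(PerInstanceLockSketch held,
                                      PerInstanceLockSketch target,
                                      long timeoutMillis) {
        CountDownLatch holding = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (held) {
                holding.countDown();                  // signal that the monitor is taken
                try {
                    release.await();                  // keep the monitor until told to let go
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        Thread caller = new Thread(target::touch);
        try {
            holder.start();
            holding.await();
            caller.start();
            caller.join(timeoutMillis);
            boolean completed = !caller.isAlive();
            release.countDown();                      // let the holder exit
            holder.join();
            caller.join();
            return completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            release.countDown();
            return false;
        }
    }

    public static void main(String[] args) {
        PerInstanceLockSketch a = new PerInstanceLockSketch();
        PerInstanceLockSketch b = new PerInstanceLockSketch();
        System.out.println("different instances: " + completesWhileHeld(a, b, 2000));
        System.out.println("same instance:       " + completesWhileHeld(a, a, 200));
    }
}
```

With two instances the call goes through while the other monitor is held; with a single shared instance the caller waits until the holder releases it.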

Yueyu Lin
