lucene-java-user mailing list archives

From: Erick Erickson <erickerick...@gmail.com>
Subject: Re: Performance issue
Date: Mon, 02 Feb 2009 13:25:24 GMT

Prefix queries are expensive here. The problem is
that each one forms a very large OR clause on all
the terms that start with those two letters. For instance,
if a field in your index contained
mine
milanta
mica

a prefix search on "mi" would form
mine OR milanta OR mica.

With only two letters, the list of matching terms can get very
long, and doing this across seven fields gets expensive.
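
To make the expansion concrete, here is a minimal sketch in plain
Java (no Lucene involved). The class and method names are made up
for illustration, and a TreeSet stands in for one field's sorted
term dictionary; it only mimics the idea of turning a prefix into
an OR over every matching term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class PrefixExpansionSketch {

    // Collects every term in the sorted dictionary that starts with the prefix.
    static List<String> expand(SortedSet<String> terms, String prefix) {
        List<String> matches = new ArrayList<String>();
        // tailSet(prefix) starts at the first term >= prefix.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match
            }
            matches.add(term);
        }
        return matches;
    }

    // Joins the matching terms into the OR clause described above.
    static String toOrClause(List<String> matches) {
        StringBuilder sb = new StringBuilder();
        for (String term : matches) {
            if (sb.length() > 0) {
                sb.append(" OR ");
            }
            sb.append(term);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<String>(
                Arrays.asList("mine", "milanta", "mica", "moon", "lamp"));
        // prints "mica OR milanta OR mine"
        System.out.println(toOrClause(expand(terms, "mi")));
    }
}
```

The number of clauses grows with the number of distinct terms that
share the prefix, which is why a two-letter prefix hurts far more
than a longer one.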

Two things:
1> What problem are you trying to solve? Perhaps some of the
folks on the list can offer suggestions. There are many possible
strategies, depending on what you want to accomplish. A 300 MB
index isn't very big, so you could, for instance, index a
separate field that contains only the first two letters and
search *that* for two-letter queries. I'm assuming that prefix
queries of three or more letters perform acceptably.

2> How are you measuring query time? If you're measuring the
time it takes when you first start a searcher, be aware that the
first few queries are usually slow because the caches haven't
been filled. Further, are you measuring total response time or
*just* the query time? It's possible that the time is being
spent assembling the response in your code rather than in the
actual search. You might insert some timers to find out.
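
For the separate-field idea in point 1, a rough sketch of deriving
the values at index time might look like the following. It is
plain Java with made-up names, and it assumes that each returned
value is indexed as a single un-analyzed term in an extra field,
so that a two-letter search becomes an exact term lookup (a
TermQuery) on that field instead of a prefix expansion.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class PrefixFieldSketch {

    static final int PREFIX_LENGTH = 2;

    // Returns the distinct lower-cased two-letter prefixes of the
    // whitespace-separated words in the text. Words shorter than two
    // letters are skipped, since a two-letter prefix cannot match them.
    static Set<String> twoLetterPrefixes(String text) {
        Set<String> prefixes = new LinkedHashSet<String>();
        for (String word : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (word.length() >= PREFIX_LENGTH) {
                prefixes.add(word.substring(0, PREFIX_LENGTH));
            }
        }
        return prefixes;
    }

    public static void main(String[] args) {
        // Assumption: these values would go into a hypothetical extra
        // field alongside the original one.
        // prints "[mi, ho]"
        System.out.println(twoLetterPrefixes("Mine milanta Mica house"));
    }
}
```

The splitting here is deliberately naive; in practice the prefixes
would need to be derived from the same tokens your analyzer
produces, or the two fields will disagree.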

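As for the timers in point 2, something along these lines would
separate search time from response-assembly time. This is only a
sketch: runSearch and assembleResponse are hypothetical stand-ins
for your own code, and the warm-up loop is there so the first,
cache-cold queries are not what gets measured.

```java
import java.util.ArrayList;
import java.util.List;

public class QueryTimingSketch {

    // Hypothetical stand-in for the real search call.
    static List<String> runSearch(String query) {
        List<String> hits = new ArrayList<String>();
        for (int i = 0; i < 100; i++) {
            hits.add(query + "-doc-" + i);
        }
        return hits;
    }

    // Hypothetical stand-in for building the response shown to the user.
    static String assembleResponse(List<String> hits) {
        StringBuilder sb = new StringBuilder();
        for (String hit : hits) {
            sb.append(hit).append('\n');
        }
        return sb.toString();
    }

    // Returns {searchNanos, assemblyNanos} for one query, measured
    // after the given number of untimed warm-up runs.
    static long[] measure(String query, int warmupRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            runSearch(query); // discarded: lets caches fill before timing
        }
        long t0 = System.nanoTime();
        List<String> hits = runSearch(query);
        long t1 = System.nanoTime();
        assembleResponse(hits);
        long t2 = System.nanoTime();
        return new long[] { t1 - t0, t2 - t1 };
    }

    public static void main(String[] args) {
        long[] nanos = measure("ho", 3);
        System.out.println("search:   " + nanos[0] / 1000000.0 + " ms");
        System.out.println("assembly: " + nanos[1] / 1000000.0 + " ms");
    }
}
```

If the assembly number dominates, the prefix queries are not the
first thing to fix.
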
Best
Erick

On Mon, Feb 2, 2009 at 2:58 AM, Mittal, Sourabh (IDEAS) <
Sourabh-931.Mittal@morganstanley.com> wrote:

> Hi All,
>
> We face serious performance issues when users do 2-letter searches,
> e.g. ho, jo, pa ma, um ar, ma fi, etc. Time taken is between 10 and
> 15 secs. Below are our implementation details:
>
> 1. Search is performed on 7 fields.
> 2. PrefixQuery implementation on all fields.
> 3. AND search.
> 4. Our index size is 300 MB.
> 5. We show only the top 100 documents, ranked by score.
> 6. We use StandardAnalyzer & StandardTokenizer for indexing &
> searching.
> 7. Lucene 2.4
> 8. JDK 1.6
>
> Please suggest how we can improve the performance.
>
> Regards,
> Sourabh Mittal
> Morgan Stanley | IDEAS Practice Areas
> Manikchand Ikon | South Wing 18 | Dhole Patil Road
> Pune, 411001
> Phone: +91 20 2620-7053
> Sourabh-931.Mittal@morganstanley.com
