lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Samarendra Pratap <>
Subject Re: Sharding Techniques
Date Wed, 11 May 2011 08:10:14 GMT
Hi Tom,
 the more i am getting responses in this thread the more i feel that our
application needs optimization.

350 GB and less than 2 seconds!!! That's much more than my expectation :-)
(in current scenario).

*characteristics of slow queries:*
 there are a few reasons for greater search time

 1.Two of our fields contain decimal values but are not NumericField :( .
These fields are searched as a range. Whenever the ranges are larger and/or
both the fields are used in search the search time and server load goes
high. I have already started work to convert it to NumericField - but
suggestions and experiences are most welcome.

2. When queries (without two fields mentioned above) have a lot of
words/phrases search time is high. E.g I took a query with around 80 unique
terms (not words) in 5 fields. These terms occur repeatedly and become total
225 terms (non-unique). This particular query took 4.2 seconds. the 15
indexes used for this query were of total size 5 G.
Are 225 terms (80 unique) is a very big number?

and yes, slow queries are always slow. yes but obviously high load will add
up to their slowness.

Here I have another curiosity about something I noticed.
If I have a query like following:

title:xyz title:xyz title:xyz title:xyz title:xyz title:xyz title:xyz
title:xyz title:xyz title:xyz title:xyz

*Will lucene search for the term 11 times or it will reuse the results of
first term?*

If later is true (which I think is), is there any particular reason or it
may be optimized inside lucene?

On Tue, May 10, 2011 at 9:46 PM, Burton-West, Tom <>wrote:

> Hi Samar,
> >>Normal queries go fine under 500 ms but when people start searching
> >>"anything" some queries take up to > 100 seconds. Don't you think
> >>distributing smaller indexes on different machines would reduce the
> average
> >>.search time. (Although I have a feeling that search time for smaller
> queries
> >>may be slightly increased)
> What are the characteristics of your slow queries?  Can you give examples?
>   Are the slow queries always slow or only under heavy load?   What the
> bottleneck is and whether splitting into smaller indexes would help depends
> on just what your bottleneck is. It's not clear that your index is large
> enough that the size of the index is causing your bottleneck.
> We run indexes of about 350GB with average response times under 200ms and
> 99th percentile reponse times of under 2 seconds. (We have a very low qps
> rate however).
> Tom
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:


  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message