lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Cedric Ho" <>
Subject Re: Are there any Lucene optimizations applicable to SSD?
Date Wed, 20 Aug 2008 13:58:19 GMT
Hi Toke,

>> Search response time. We used the search log from our production
>> system and test it with SSD. The results shows that 75% of queries
>> returns within 1 second, 90% returns in 2.5 seconds, the remaining 10%
>> ranges from 2.5 seconds to less than 100 seconds.
> Are the sub-second response times fairly equal or fluctuating?

fluctuating I would say. anything between 0.01 to 1 sec. Mostly center
around 0.5 secs.

> Do you have the figures for conventional harddisks?

It is much slower than SSD, I forgot the exact figures, but about 50%
returns within 1 sec, and those slow ones can go wild up to more then
100 secs

>> Total number of queries is ~40000, so about 10000 queries are kind of
>> slow, 1000 queries are very slow. But those 10% very slow queries are
>> not from the first 1000 queries. It's more or less evenly distributed.
> Is it the same queries that are slow each time?

For this I've done an interesting test previously on harddisks. I have
two set of similar index on two machines and I queried them with
exactly the same query together. And they show very similar pattern of
search response time for each query, including those very slow queries
with response time >1-10 secs.

However I can't figure out why some of these queries are slower. Some
are complicated queries, yet others are just simple single term
queries and doesn't seems to score lots of hits. There's no
correlation between the number of terms or number of hits with the
response time. It's probably not due to filter cache miss also because
I've checked that too.

> Our queries normally has fewer terms but gets expanded to 4-5 fields.
> We have no filters and very few ranges.
> Have you tried monitoring the memory-usage of the JVM? At one time we
> had a lot of short-term allocation of fairly large structures - this
> triggered a lot of full GC's and killed performance.
> This still doesn't account for all the > 1 second response times though.

We keep monitoring it with jconsole, but don't see any abnormal gc
activities there. And yes, gc probably won't affect such slow queries
that much.

> I'll see if I can steal time enough to make a test with sorting, but no
> promises.
>> We are targeting to get >90% of queries to return under 1 sec. Of
>> course the more the better =)
> That's about the same as our original goal, but we've gotten greedier in
> the meantime.

Thanks for the help =) We'll also keep trying different methods until
our goal is met.

Cedric Ho

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message