lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <>
Subject Re: Index size and performance degradation
Date Tue, 14 Jun 2011 12:05:45 GMT
Hmm, this sounds hairy :)

Are you sure NRTCachingDir won't work for you?

Mike McCandless

On Tue, Jun 14, 2011 at 5:58 AM, Ganesh <> wrote:
> Is it a bad idea to keep multiple shards in a single system?
> Regards
> Ganesh
> ----- Original Message -----
> From: "Toke Eskildsen" <>
> To: <>
> Sent: Tuesday, June 14, 2011 12:58 PM
> Subject: Re: Index size and performance degradation
>> On Sun, 2011-06-12 at 10:10 +0200, Itamar Syn-Hershko wrote:
>>> The whole point of my question was to find out if and how to make
>>> balancing on the SAME machine. Apparently thats not going to help and at
>>> a certain point we will just have to prompt the user to buy more hardware...
>> It really depends on your scenario. If you have few concurrent requests
>> and are looking to minimize latency, sharding might help; assuming you
>> have fast IO and multiple cores. You basically want to saturate all
>> available resources for all requests.
>> On the other hand, if throughput is the issue, sharding on a single
>> machine is counter-productive due to increased duplication and merging.
>>> Out of curiosity, isn't there anything that we can do to avoid that? for
>>> instance using memory-mapped files for the indexes? anything that would
>>> help us overcome OS limitations of that sort...
>> One standard advice for speeding up searches is using SSD's. Our
>> (admittedly old) experiments puts SSD-performance near RAM. With the
>> prices we have now, SSD's seems like an obvious choice for most setups.
>> We tried a few performance tests at different index sizes and for us,
>> index size vs. performance looked like the power law: Heavy performance
>> degradation in the beginning, less later. It makes sense when we look at
>> caching and it means that if you do not require stellar performance, you
>> can have very large indexes on few machines (cue Hathi Trust).
>> - Toke Eskildsen
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message