lucene-solr-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Improving Solr performance
Date Sun, 09 Jan 2011 04:01:18 GMT
Are you using the Solr caches? These are configured in solrconfig.xml
in each core. Make sure each kind of cache is configured with a size of
at least 50-100 entries.
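
A quick way to check what a core currently has configured is to read the
cache sizes out of its solrconfig.xml. This is only a sketch: it assumes
the usual layout with the cache elements under <query>, and the sample
fragment and its sizes are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Cache element names normally found in the <query> section of solrconfig.xml.
CACHE_TAGS = ("filterCache", "queryResultCache", "documentCache")


def cache_sizes(solrconfig_xml):
    """Return {cache tag: configured size, or None if absent or unsized}."""
    root = ET.fromstring(solrconfig_xml)
    sizes = {}
    for tag in CACHE_TAGS:
        element = root.find("./query/" + tag)
        size = element.get("size") if element is not None else None
        sizes[tag] = int(size) if size is not None else None
    return sizes


# Illustrative fragment only; these sizes are not a recommendation.
SAMPLE = """
<config>
  <query>
    <filterCache class="solr.FastLRUCache" size="512"
                 initialSize="512" autowarmCount="0"/>
    <queryResultCache class="solr.LRUCache" size="512"
                      initialSize="512" autowarmCount="0"/>
  </query>
</config>
"""

for name, size in cache_sizes(SAMPLE).items():
    print(name, "not configured" if size is None else size)
```

In this sample the documentCache is reported as not configured, which is
the kind of gap worth fixing first.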

Also, use filter queries: a filter query describes a subset of
documents. When you do a bunch of queries against the same filter
query, the second and subsequent queries are much faster.
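
As a rough illustration, the filter goes in the fq parameter and the
changing part of the search goes in q. The sketch below only builds the
request URLs. The host, port, and the field names (category, title) are
assumptions for the example, not taken from the original poster's setup.

```python
from urllib.parse import urlencode

# Assumed location of the select handler; adjust for your own install.
BASE_URL = "http://localhost:8983/solr/select"


def select_url(base_url, query, filter_query):
    """Build a select URL with the main query in q and the filter in fq."""
    params = [("q", query), ("fq", filter_query)]
    return base_url + "?" + urlencode(params)


# Several different queries share one filter. The filter's document set
# can be served from the filterCache once the first request has run.
SHARED_FILTER = "category:books"  # hypothetical field and value
for user_query in ("title:lucene", "title:solr", "title:search"):
    print(select_url(BASE_URL, user_query, SHARED_FILTER))
```

Keeping the fq string identical across requests matters, since a
different filter string is a different cache entry.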

All of this is explained in the book "Solr 1.4 Enterprise Search Server".

On Fri, Jan 7, 2011 at 7:20 AM, mike anderson <saidtherobot@gmail.com> wrote:
> Making sure the index can fit in memory (you don't have to allocate that
> much to Solr, just make sure it's available to the OS so it can cache it --
> otherwise you are paging the hard drive, which is why you are probably IO
> bound) has been the key to our performance. We recently opted to use less
> RAM and store the indices on SSDs. We're still evaluating this approach, but
> so far it seems to be comparable, so I agree with Toke! (We have 18 shards
> and over 100GB of index.)
>
> On Fri, Jan 7, 2011 at 10:07 AM, Toke Eskildsen <te@statsbiblioteket.dk> wrote:
>
>> On Fri, 2011-01-07 at 10:57 +0100, supersoft wrote:
>>
>> [5 shards, 100GB, ~20M documents]
>>
>> ...
>>
>> [Low performance for concurrent searches]
>>
>> > Using JConsole for monitoring the server java process I checked that
>> > Heap Memory and the CPU usage don't reach the upper limits so the server
>> > shouldn't behave as if overloaded.
>>
>> If memory and CPU are okay, the culprit is I/O.
>>
>> Solid state drives have more than proven their worth for random access
>> I/O, which is used a lot when searching with Solr/Lucene. SSDs are
>> plug-in replacements for hard drives and they virtually eliminate I/O
>> performance bottlenecks when searching. This also means shortened warm
>> up requirements and less need for disk caching. Expanding RAM capacity
>> does not scale well and requires extensive warmup. Adding more machines
>> is expensive and often requires architectural changes. With the current
>> prices for SSDs, I consider them the generic first suggestion for
>> improving search performance.
>>
>> Extra spinning disks improve the query throughput in general and speed
>> up single queries when the shards are searched in parallel. They do not
>> help much for a single sequential search of shards, as the seek time
>> for a single I/O request is the same regardless of the number of drives.
>> If your current response time for a single user is satisfactory, adding
>> drives is a viable solution for you. I'd still recommend the SSD option
>> though, as it will also lower the response time for a single query.
>>
>> Regards,
>> Toke Eskildsen
>>
>>
>



-- 
Lance Norskog
goksron@gmail.com
