lucene-solr-user mailing list archives

From Walter Underwood <wun...@wunderwood.org>
Subject Re: CPU utilization and query time high on Solr slave when snapshot install
Date Tue, 03 Nov 2009 16:20:16 GMT
Optimizing an index takes CPU, but if you are doing it on a machine  
dedicated to indexing, that does not matter. It will make queries  
faster.

wunder

On Nov 3, 2009, at 2:12 AM, bikumar@sapient.com wrote:

> Hi Walter,
>
> When the issue occurred, we did try turning autowarming off, but it
> did not solve the problem. The only thing that worked was
> optimizing the slave index.
> But what you say is logical, and I will try it again.
>
> But the basic question I have is this: our Solr index is not huge by any
> means. Also, I have read on the wiki etc. that optimize has an adverse
> impact on performance and hence should be done only once a day. So what
> is wrong in our case that is causing the performance problem (we serve
> just 4 req/sec)? Why is optimize fixing the issue, contrary to normal
> belief? How will this workaround affect us as the index size
> increases?
>
> Regds,
> Bipul
>
> -----Original Message-----
> From: Walter Underwood [mailto:wunder@wunderwood.org]
> Sent: Monday, November 02, 2009 11:18 PM
> To: solr-user@lucene.apache.org
> Subject: Re: CPU utilization and query time high on Solr slave when  
> snapshot install
>
> If you are going to pull a new index every 10 minutes, try turning off
> cache autowarming.
>
> Your caches are never more than 10 minutes old, so spending a minute
> warming each new cache is a waste of CPU. Autowarm submits queries to
> the new Searcher before putting it in service. This will create a
> burst of query load on the new Searcher, often keeping one CPU pretty
> busy for several seconds.
>
> In solrconfig.xml, set autowarmCount to 0.
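>
> A minimal sketch of that change, assuming cache entries like those in
> the stock Solr 1.3 example solrconfig.xml (the class and size values
> here are illustrative; keep whatever your config already uses and
> change only autowarmCount):
>
> ```xml
> <!-- Inside the <query> section of solrconfig.xml on each slave. -->
> <!-- Sizes are assumed example values, not taken from this thread. -->
> <filterCache
>   class="solr.LRUCache"
>   size="512"
>   initialSize="512"
>   autowarmCount="0"/>
>
> <queryResultCache
>   class="solr.LRUCache"
>   size="512"
>   initialSize="512"
>   autowarmCount="0"/>
> ```
>
> With autowarmCount at 0, a new Searcher starts with empty caches
> instead of replaying entries from the old ones.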
>
> Also, if you want the slaves to always have an optimized index, create
> the snapshot only in post-optimize. If you create snapshots in both
> post-commit and post-optimize, you are creating a non-optimized index
> (post-commit), then replacing it with an optimized one a few minutes
> later. A slave might get a non-optimized index one time, then an
> optimized one the next.
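>
> As a rough sketch, assuming the master takes snapshots through
> RunExecutableListener as in the example solrconfig.xml (the exe and
> dir values below are assumptions that depend on your install layout),
> that means keeping only the postOptimize listener:
>
> ```xml
> <!-- Inside <updateHandler> in solrconfig.xml on the master. -->
> <!-- Remove or comment out any postCommit snapshooter listener. -->
> <listener event="postOptimize" class="solr.RunExecutableListener">
>   <!-- Path to the snapshooter script: assumed, adjust to your setup. -->
>   <str name="exe">solr/bin/snapshooter</str>
>   <str name="dir">.</str>
>   <bool name="wait">true</bool>
> </listener>
> ```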
>
> wunder
>
> On Nov 2, 2009, at 1:45 AM, bikumar@sapient.com wrote:
>
>> Hi Solr Gurus,
>>
>> We have Solr in a 1 master, 2 slave configuration. A snapshot is created
>> post-commit and post-optimize. We autocommit after 50
>> documents or 5 minutes. The snapshot puller runs as a cron job every 10
>> minutes. What we have observed is that whenever a snapshot is
>> installed on the slave, the SolrJ client used to query the slave
>> times out and there is high CPU usage/load average on the slave
>> server. If we stop the snapshot puller, the slaves work with no issues.
>> The system has been running for 2 months, and this issue has
>> started to occur only now, when load on the website is increasing.
>>
>> Following are some details:
>>
>> Solr Details:
>> apache-solr Version: 1.3.0
>> Lucene - 2.4-dev
>>
>> Master/Slave configurations:
>>
>> Master:
>> - for indexing data, HTTP requests are made to the Solr server
>> - autocommit is enabled for 50 docs and 5 minutes
>> - caching params are disabled for this server
>> - mergeFactor of 10 is set
>> - we were running the optimize script every 2 hours, but have now
>> reduced the frequency to twice a day; the issue still persists
>>
>> Slave1/Slave2:
>> - standard requestHandler is being used
>> - default values of caching are set
>>
>> Machine Specifications:
>>
>> Master:
>> - 4GB RAM
>> - 1GB JVM Heap memory is allocated to Solr
>>
>> Slave1/Slave2:
>> - 4GB RAM
>> - 2GB JVM Heap memory is allocated to Solr
>>
>> Master and Slave1 (solr1) are on a single box and Slave2 (solr2) is on a
>> different box. We use HAProxy to load balance query requests between
>> the 2 slaves. Master is only used for indexing.
>> Please let us know if somebody has faced a similar issue
>> or has some insight into it, as we are literally stuck at the
>> moment with a very unstable production environment.
>>
>> As a workaround, we have started running optimize on the master every 7
>> minutes. This seems to have reduced the severity of the problem, but
>> the issue still occurs every 2 days now. Please suggest what could be the
>> root cause of this.
>>
>> Thanks,
>> Bipul
>>
>>
>>
>>
>

