lucene-solr-user mailing list archives

From wojtekpia <wojte...@hotmail.com>
Subject RE: Throughput Optimization
Date Wed, 05 Nov 2008 19:07:57 GMT

My documentCache hit rate is ~.7, and my queryCache is ~.03. I'm using
FastLRUCache on all 3 of the caches.
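
For a rough sense of why raising the filterCache size runs into the heap
limit described in the quoted message below, here is a back-of-the-envelope
sketch in Python. It assumes the worst case, where every cached filter is
stored as a bitset with one bit per document; that is an assumption for
illustration, since small result sets can be stored more compactly, so real
usage should be well below this. The function names are hypothetical; the
document count, cache size and heap size are the figures from this thread.

```python
def bitset_bytes(max_doc: int) -> int:
    """Bytes for a bitset with one bit per document, rounded up to 64-bit words."""
    words = (max_doc + 63) // 64
    return words * 8


def worst_case_cache_bytes(max_doc: int, entries: int) -> int:
    """Upper bound on cache memory if every entry is a full bitset (assumption)."""
    return bitset_bytes(max_doc) * entries


MAX_DOC = 10_000_000        # records in the index (from the thread)
CACHE_SIZE = 700_000        # configured filterCache size (from the thread)
HEAP_BYTES = 24 * 1024**3   # about 24GB given to the java process (from the thread)

per_entry = bitset_bytes(MAX_DOC)
worst_case = worst_case_cache_bytes(MAX_DOC, CACHE_SIZE)
entries_that_fit = HEAP_BYTES // per_entry

print(f"bytes per full bitset entry: {per_entry:,}")
print(f"worst case for {CACHE_SIZE:,} entries: {worst_case / 1e9:,.0f} GB")
print(f"full bitset entries that fit in the heap: {entries_that_fit:,}")
```

Under that worst-case assumption, only a small fraction of the configured
entries could ever be full bitsets before the heap is exhausted, which would
be consistent with seeing Out Of Memory errors when the cache grows.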


Feak, Todd wrote:
> 
> What are your other cache hit rates looking like?
> Which caches are you using the FastLRUCache on?
> 
> -Todd Feak
> 
> -----Original Message-----
> From: wojtekpia [mailto:wojtek_p@hotmail.com] 
> Sent: Wednesday, November 05, 2008 8:15 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Throughput Optimization
> 
> 
> Yes, I am seeing evictions. I've tried setting my filterCache higher, but
> then I start getting Out Of Memory exceptions. My filterCache hit ratio
> is > .99. It looks like I've hit a RAM bound here.
> 
> I ran a test without faceting. The response times / throughput were both
> significantly higher, there were no evictions from the filter cache, but I
> still wasn't getting > 50% CPU utilization. Any thoughts on what physical
> bound I've hit in this case?
> 
> 
> 
> Erik Hatcher wrote:
>> 
>> One quick question.... are you seeing any evictions from your
>> filterCache?  If so, it isn't set large enough to handle the faceting
>> you're doing.
>> 
>> 	Erik
>> 
>> 
>> On Nov 4, 2008, at 8:01 PM, wojtekpia wrote:
>> 
>>>
>>> I've been running load tests over the past week or 2, and I can't figure
>>> out my system's bottleneck that prevents me from increasing throughput.
>>> First I'll describe my Solr setup, then what I've tried to optimize the
>>> system.
>>>
>>> I have 10 million records and 59 fields (all are indexed, 37 are stored,
>>> 17 have termVectors, 33 are multi-valued) which takes about 15GB of disk
>>> space. Most field values are very short (single word or number), and
>>> usually about half the fields have any data at all. I'm running on an
>>> 8-core, 64-bit, 32GB RAM Redhat box. I allocate about 24GB of memory to
>>> the java process, and my filterCache size is 700,000. I'm using a version
>>> of Solr between 1.3 and the current trunk (including the latest SOLR-667
>>> (FastLRUCache) patch), and Tomcat 6.0.
>>>
>>> I'm running a ramp-test, increasing the number of users every few
>>> minutes. I measure the maximum number of requests that Solr can handle
>>> per second with a fixed response time, and call that my throughput. I'd
>>> like to see a single physical resource be maxed out at some point during
>>> my test so I know it is my bottleneck. I generated random queries for my
>>> dataset representing a more or less realistic scenario. The queries
>>> include faceting by up to 6 fields, and querying by up to 8 fields.
>>>
>>> I ran a baseline on the un-optimized setup, and saw peak CPU usage of
>>> about 50%, IO usage around 5%, and negligible network traffic.
>>> Interestingly, the CPU peaked when I had 8 concurrent users, and actually
>>> dropped down to about 40% when I increased the users beyond 8. Is that
>>> because I have 8 cores?
>>>
>>> I changed a few settings and observed the effect on throughput:
>>>
>>> 1. Increased filterCache size, and throughput increased by about 50%, but
>>> it seems to peak.
>>> 2. Put the entire index on a RAM disk, and significantly reduced the
>>> average response time, but my throughput didn't change (i.e. even though
>>> my response time was 10X faster, the maximum number of requests I could
>>> make per second didn't increase). This makes no sense to me, unless there
>>> is another bottleneck somewhere.
>>> 3. Reduced the number of records in my index. The throughput increased,
>>> but the shape of all my graphs stayed the same, and my CPU usage was
>>> identical.
>>>
>>> I have a few questions:
>>> 1. Can I get more than 50% CPU utilization?
>>> 2. Why does CPU utilization fall when I make more than 8 concurrent
>>> requests?
>>> 3. Is there an obvious bottleneck that I'm missing?
>>> 4. Does Tomcat have any settings that affect Solr performance?
>>>
>>> Any input is greatly appreciated.
>>>
>>> -- 
>>> View this message in context:
>>> http://www.nabble.com/Throughput-Optimization-tp20335132p20335132.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/Throughput-Optimization-tp20335132p20343425.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Throughput-Optimization-tp20335132p20346663.html
Sent from the Solr - User mailing list archive at Nabble.com.

