lucene-solr-user mailing list archives

From "Feak, Todd" <Todd.F...@smss.sony.com>
Subject RE: Throughput Optimization
Date Wed, 05 Nov 2008 16:49:34 GMT
What are your other cache hit rates looking like?
Which caches are you using the FastLRUCache on?

-Todd Feak

-----Original Message-----
From: wojtekpia [mailto:wojtek_p@hotmail.com] 
Sent: Wednesday, November 05, 2008 8:15 AM
To: solr-user@lucene.apache.org
Subject: Re: Throughput Optimization


Yes, I am seeing evictions. I've tried setting my filterCache higher, but
then I start getting Out Of Memory exceptions. My filterCache hit ratio is
> .99. It looks like I've hit a RAM bound here.
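
A rough sketch of why a larger filterCache can exhaust the heap. It assumes
every cached filter is held as a full bitset with one bit per document;
that is an assumption, not something confirmed in this thread, and Solr can
store small result sets more compactly, so real usage is usually well below
this worst case. The function names are hypothetical; the numbers (10
million records, filterCache size 700,000, 24GB heap) come from the
original message.

```python
def bitset_bytes(max_doc):
    """Bytes needed for one bit per document, rounded up to whole bytes."""
    return (max_doc + 7) // 8


def worst_case_cache_bytes(cache_size, max_doc):
    """Upper bound: every cache entry is a full bitset (assumption)."""
    return cache_size * bitset_bytes(max_doc)


MAX_DOC = 10_000_000         # records in the index (from the thread)
CACHE_SIZE = 700_000         # filterCache size (from the thread)
HEAP_BYTES = 24 * 2**30      # about 24GB given to the java process

per_entry = bitset_bytes(MAX_DOC)
total = worst_case_cache_bytes(CACHE_SIZE, MAX_DOC)
fits_in_heap = HEAP_BYTES // per_entry

print(f"one full bitset:       {per_entry:,} bytes")
print(f"worst case, all full:  {total:,} bytes")
print(f"full bitsets per heap: {fits_in_heap:,}")
```

Under that assumption a single entry is about 1.25MB, so only a small
fraction of 700,000 entries could be full bitsets before the heap fills,
even if nothing else used it.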

I ran a test without faceting. The response times and throughput were both
significantly better, and there were no evictions from the filter cache,
but I still wasn't getting > 50% CPU utilization. Any thoughts on what
physical bound I've hit in this case?



Erik Hatcher wrote:
> 
> One quick question... are you seeing any evictions from your
> filterCache?  If so, it isn't set large enough to handle the faceting
> you're doing.
> 
> 	Erik
> 
> 
> On Nov 4, 2008, at 8:01 PM, wojtekpia wrote:
> 
>>
>> I've been running load tests over the past week or two, and I can't
>> figure out my system's bottleneck that prevents me from increasing
>> throughput. First I'll describe my Solr setup, then what I've tried to
>> optimize the system.
>>
>> I have 10 million records and 59 fields (all are indexed, 37 are
>> stored, 17 have termVectors, 33 are multi-valued), which take about
>> 15GB of disk space. Most field values are very short (a single word or
>> number), and usually only about half the fields have any data at all.
>> I'm running on an 8-core, 64-bit, 32GB RAM Redhat box. I allocate about
>> 24GB of memory to the java process, and my filterCache size is
>> 700,000. I'm using a version of Solr between 1.3 and the current trunk
>> (including the latest SOLR-667 (FastLRUCache) patch), and Tomcat 6.0.
>>
>> I'm running a ramp-test, increasing the number of users every few
>> minutes. I measure the maximum number of requests that Solr can handle
>> per second with a fixed response time, and call that my throughput.
>> I'd like to see a single physical resource be maxed out at some point
>> during my test so I know it is my bottleneck. I generated random
>> queries for my dataset representing a more or less realistic scenario.
>> The queries include faceting by up to 6 fields, and querying by up to
>> 8 fields.
>>
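
One way to read the throughput definition above is as "the highest
requests per second seen at any ramp step whose response time stayed
within a fixed limit". The sketch below shows that reading only; the
function name, the 0.5s limit, and the sample figures are all hypothetical
and are not measurements from this thread.

```python
def max_throughput(samples, max_response_s):
    """Return the highest requests/second among ramp steps whose
    response time stayed within the limit, or None if no step did."""
    within_limit = [rps for _users, rps, resp_s in samples
                    if resp_s <= max_response_s]
    return max(within_limit) if within_limit else None


# Hypothetical ramp steps: (concurrent users, requests/s, avg response s)
ramp = [
    (2, 10.0, 0.20),
    (4, 19.0, 0.21),
    (8, 30.0, 0.27),
    (16, 31.0, 0.52),
    (32, 29.0, 1.10),
]

print(max_throughput(ramp, max_response_s=0.5))
```

With these made-up figures the 16-user step has the highest raw rate but
exceeds the response-time limit, so the 8-user step defines the throughput.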
>> I ran a baseline on the un-optimized setup, and saw peak CPU usage of
>> about 50%, IO usage around 5%, and negligible network traffic.
>> Interestingly, the CPU peaked when I had 8 concurrent users, and
>> actually dropped down to about 40% when I increased the users beyond
>> 8. Is that because I have 8 cores?
>>
>> I changed a few settings and observed the effect on throughput:
>>
>> 1. Increased filterCache size, and throughput increased by about 50%,
>>    but it seems to peak.
>> 2. Put the entire index on a RAM disk, and significantly reduced the
>>    average response time, but my throughput didn't change (i.e. even
>>    though my response time was 10X faster, the maximum number of
>>    requests I could make per second didn't increase). This makes no
>>    sense to me, unless there is another bottleneck somewhere.
>> 3. Reduced the number of records in my index. The throughput
>>    increased, but the shape of all my graphs stayed the same, and my
>>    CPU usage was identical.
>>
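
The RAM disk result in item 2 (10X faster responses, same requests per
second) can be checked against the interactive response time law for a
closed-loop load test, X = N / (R + Z), where N is the number of simulated
users, R the response time, and Z the think time between a user's
requests. The sketch below is one possible explanation, not a diagnosis:
the thread does not say whether the load generator uses think time, and
all the numbers are hypothetical.

```python
def closed_loop_throughput(users, response_time_s, think_time_s=0.0):
    """Interactive response time law: X = N / (R + Z), in requests/s."""
    return users / (response_time_s + think_time_s)


# Hypothetical values: 8 users, 5s think time, response time cut 10X.
slow = closed_loop_throughput(users=8, response_time_s=0.5, think_time_s=5.0)
fast = closed_loop_throughput(users=8, response_time_s=0.05, think_time_s=5.0)
print(f"with think time:    {slow:.2f} -> {fast:.2f} requests/s")

# Same 10X speedup with no think time: throughput scales with it.
slow_nz = closed_loop_throughput(users=8, response_time_s=0.5)
fast_nz = closed_loop_throughput(users=8, response_time_s=0.05)
print(f"without think time: {slow_nz:.0f} -> {fast_nz:.0f} requests/s")
```

If waiting outside Solr (think time, client-side limits, or a capped
request thread pool) dominates each request cycle, a much faster response
time barely moves the measured requests per second.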
>> I have a few questions:
>> 1. Can I get more than 50% CPU utilization?
>> 2. Why does CPU utilization fall when I make more than 8 concurrent
>> requests?
>> 3. Is there an obvious bottleneck that I'm missing?
>> 4. Does Tomcat have any settings that affect Solr performance?
>>
>> Any input is greatly appreciated.
>>
>> -- 
>> View this message in context:
>>
>> http://www.nabble.com/Throughput-Optimization-tp20335132p20335132.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context:
http://www.nabble.com/Throughput-Optimization-tp20335132p20343425.html
Sent from the Solr - User mailing list archive at Nabble.com.


