lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Feak, Todd" <Todd.F...@smss.sony.com>
Subject RE: Throughput Optimization
Date Wed, 05 Nov 2008 19:29:04 GMT
Yonik said something about the FastLRUCache giving the most gain for
high hit-rates and the LRUCache being faster for low hit-rates. It's in
his Nov 1 comment on SOLR-667. I'm not sure if anything changed since
then, as it's an active issue, but you may want to try the LRUCache for
your query cache.

It sounds like you are memory bound already, but you may want to
investigate the tradeoffs of your filter cache vs. document cache. High
document hit-rate was a big performance boost for us, as document
garbage collection is a lot of overhead. I believe that would show up as
CPU usage though, so it may not be your bottleneck.

This also brings up an interesting question. 3% hit rate on your query
cache seems low to me. Are you sure your load test is mimicking
realistic query patterns from your user base? I realize this probably
isn't part of your bottleneck, just curious.

-Todd Feak

-----Original Message-----
From: wojtekpia [mailto:wojtek_p@hotmail.com] 
Sent: Wednesday, November 05, 2008 11:08 AM
To: solr-user@lucene.apache.org
Subject: RE: Throughput Optimization


My documentCache hit rate is ~.7, and my queryCache is ~.03. I'm using
FastLRUCache on all 3 of the caches.


Feak, Todd wrote:
> 
> What are your other cache hit rates looking like?
> Which caches are you using the FastLRUCache on?
> 
> -Todd Feak
> 
> -----Original Message-----
> From: wojtekpia [mailto:wojtek_p@hotmail.com] 
> Sent: Wednesday, November 05, 2008 8:15 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Throughput Optimization
> 
> 
> Yes, I am seeing evictions. I've tried setting my filterCache higher,
> but
> then I start getting Out Of Memory exceptions. My filterCache hit
ratio
> is >
> .99. It looks like I've hit a RAM bound here.
> 
> I ran a test without faceting. The response times / throughput were
both
> significantly higher, there were no evictions from the filter cache,
but
> I
> still wasn't getting > 50% CPU utilization. Any thoughts on what
> physical
> bound I've hit in this case?
> 
> 
> 
> Erik Hatcher wrote:
>> 
>> One quick question.... are you seeing any evictions from your  
>> filterCache?  If so, it isn't set large enough to handle the faceting
> 
>> you're doing.
>> 
>> 	Erik
>> 
>> 
>> On Nov 4, 2008, at 8:01 PM, wojtekpia wrote:
>> 
>>>
>>> I've been running load tests over the past week or 2, and I can't  
>>> figure out
>>> my system's bottle neck that prevents me from increasing throughput.
> 
>>> First
>>> I'll describe my Solr setup, then what I've tried to optimize the  
>>> system.
>>>
>>> I have 10 million records and 59 fields (all are indexed, 37 are  
>>> stored, 17
>>> have termVectors, 33 are multi-valued) which takes about 15GB of  
>>> disk space.
>>> Most field values are very short (single word or number), and  
>>> usually about
>>> half the fields have any data at all. I'm running on an 8-core, 64- 
>>> bit, 32GB
>>> RAM Redhat box. I allocate about 24GB of memory to the java process,
> 
>>> and my
>>> filterCache size is 700,000. I'm using a version of Solr between 1.3
> 
>>> and the
>>> current trunk (including the latest SOLR-667 (FastLRUCache) patch),

>>> and
>>> Tomcat 6.0.
>>>
>>> I'm running a ramp-test, increasing the number of users every few  
>>> minutes. I
>>> measure the maximum number of requests that Solr can handle per  
>>> second with
>>> a fixed response time, and call that my throughput. I'd like to see

>>> a single
>>> physical resource be maxed out at some point during my test so I  
>>> know it is
>>> my bottle neck. I generated random queries for my dataset  
>>> representing a
>>> more or less realistic scenario. The queries include faceting by up

>>> to 6
>>> fields, and quering by up to 8 fields.
>>>
>>> I ran a baseline on the un-optimized setup, and saw peak CPU usage  
>>> of about
>>> 50%, IO usage around 5%, and negligible network traffic.  
>>> Interestingly, the
>>> CPU peaked when I had 8 concurrent users, and actually dropped down

>>> to about
>>> 40% when I increased the users beyond 8. Is that because I have 8  
>>> cores?
>>>
>>> I changed a few settings and observed the effect on throughput:
>>>
>>> 1. Increased filterCache size, and throughput increased by about  
>>> 50%, but it
>>> seems to peak.
>>> 2. Put the entire index on a RAM disk, and significantly reduced the
> 
>>> average
>>> response time, but my throughput didn't change (i.e. even though my

>>> response
>>> time was 10X faster, the maximum number of requests I could make per
> 
>>> second
>>> didn't increase). This makes no sense to me, unless there is another
> 
>>> bottle
>>> neck somewhere.
>>> 3. Reduced the number of records in my index. The throughput  
>>> increased, but
>>> the shape of all my graphs stayed the same, and my CPU usage was  
>>> identical.
>>>
>>> I have a few questions:
>>> 1. Can I get more than 50% CPU utilization?
>>> 2. Why does CPU utilization fall when I make more than 8 concurrent
>>> requests?
>>> 3. Is there an obvious bottleneck that I'm missing?
>>> 4. Does Tomcat have any settings that affect Solr performance?
>>>
>>> Any input is greatly appreciated.
>>>
>>> -- 
>>> View this message in context:
>>>
> http://www.nabble.com/Throughput-Optimization-tp20335132p20335132.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/Throughput-Optimization-tp20335132p20343425.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 
> 

-- 
View this message in context:
http://www.nabble.com/Throughput-Optimization-tp20335132p20346663.html
Sent from the Solr - User mailing list archive at Nabble.com.



Mime
View raw message