lucene-solr-user mailing list archives

From Sean Fox <s...@carleton.edu>
Subject Re: Solr RPS is painfully low
Date Thu, 03 Jan 2008 01:20:45 GMT
Have you tried breaking these queries up into a set of filter
queries?

fq=gender:f&fq=friends:y&fq=country:us&fq=age:(18 || 19 || 20 || 21)&fq=photos:y

(modulo correct syntax and URL encoding)

That should give you the same results, but each fq is cached separately
as a bitset (in Solr's filterCache), so future queries that share a
restriction (e.g. gender:f) can reuse the cached bitset rather than
running that part of the query again.
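
As a rough illustration of the suggestion above, here is a minimal Python sketch that builds the query string with one fq parameter per clause, so each clause can be cached on its own. The helper name build_filter_query_params and the match-all main query *:* are assumptions made for the example, not something taken from the original thread; the field names come from the query quoted below.

```python
from urllib.parse import urlencode, parse_qs


def build_filter_query_params(clauses, main_query="*:*"):
    """Return a URL-encoded query string with one fq parameter per clause.

    Hypothetical helper for illustration only. The match-all main query
    ("*:*") is an assumption; substitute whatever main query you need.
    """
    params = [("q", main_query)]
    # One ("fq", clause) pair per restriction, so Solr can cache each
    # clause as its own filter instead of one combined boolean query.
    params.extend(("fq", clause) for clause in clauses)
    return urlencode(params)


# Clauses taken from the example query in this thread.
clauses = [
    "gender:f",
    "friends:y",
    "country:us",
    "age:(18 || 19 || 20 || 21)",
    "photos:y",
]

query_string = build_filter_query_params(clauses)
print(query_string)
```

The resulting string can be appended to the select handler URL of your Solr instance. Keeping each restriction in its own fq means a later query that changes only one clause (say, the age range) can still reuse the cached filters for the other four.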

Don't know if this applies to your situation, but it might help a lot.

Sean

On Jan 2, 2008, at 6:09 PM, Alex Benjamen wrote:

> Hello,
>
> I have a situation where I'm using Solr with a 3 GB complete index
> (in RAM) on a dual-core
> AMD machine, and I'm only getting about 1.3 RPS on cold queries
> (for the most part, there
> is little chance of two queries being identical).
>
> Is this normal? The index contains about 20 MM documents and I have
> 16 GB of RAM. When I
> perform the load test, the CPU hits 100%.
>
> Here's a typical query:
> gender:f AND ( friends:y )  AND  country:us AND age:(18 || 19 || 20  
> || 21) AND photos:y
>
> On average the result set is a few hundred thousand - is there any  
> way to optimize such a query
> or how can I get a better RPS? 1.3 rps is way too low for an index  
> that fits completely into RAM
>
> So after some thoughts on how to reduce the number of the documents  
> in the index (which is the single
> biggest factor in CPU usage) I've decided to split up the users by  
> country, which gives me a somewhat
> uneven distribution of users. Even after doing this, I'm getting
> only about 8 RPS across 5 Solr instances
> running on different ports (with all of the indexes in RAM). Each
> index contains between 4 and 8 MM docs.
> I guess I could go with quad core and get 16RPS, but the question  
> that comes to my mind is whether
> this is an acceptable RPS for the size of index. (The total  
> physical index size is around 1.3GB which
> is all on a ramdisk in memory). I'm positive that I can get a  
> better RPS by "splitting" the index further,
> into smaller document sets, but this is undesired as it limits  
> functionality
>
> Note: due to the nature of the search I'm doing, it's very
> unlikely that I will be able to achieve
> more than a 20% cache hit ratio in the queryResultCache.
>
> Thanks,
> -Alex
>
>

