lucene-solr-user mailing list archives

From: evgeniy.stro...@yahoo.com
Subject: Re: Cache use
Date: Tue, 04 Dec 2007 19:52:59 GMT
Any suggestions are helpful to me, even general ones. Here is the info from my index:
How big is the index on disk (the most important files are .frq,
and .prx if you do phrase queries)?
- Total index folder size is 30.7 GB
- .frq is 12.2 GB
- .prx is 6 GB
 
How big and what exactly is a record in your system?  
- A record is a document with 100 fields indexed, 10 of them stored. Approximately 60% of
the fields contain data.
 
Do you do faceting/sorting?  
- Yes, I'm planning to do both.
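
(For reference, a faceting request against the standard request handler looks
something like the following; the TOWN field and the localhost URL are just taken
from the examples in this thread, not from your actual setup:

    http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=TOWN

Worth keeping in mind that faceting and sorting both build per-field structures
in memory, so they factor into the RAM question below.)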
 
How much memory do you have?  
- I have 8 GB of RAM; I could get up to 16 GB.
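
(A rough sketch of how that RAM might be split, with made-up numbers rather than a
recommendation: give the JVM a moderate heap for Solr's caches and leave the rest
to the OS page cache, which is what actually keeps the .frq/.prx files hot. With
the example Jetty setup that ships with Solr, that would look like:

    java -Xms2g -Xmx4g -jar start.jar
)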
 
What does a typical query look like?
- I don't know yet. We are in prototype mode and trying everything possible. In general we
get results in sub-second time, but some queries take long, for example TOWN:L*. I know
this is a very broad query, and probably the worst one, but we may need such queries to get
the number of towns whose names start with "L", for example. The cache helps a little: after
this query, if I run TOWN:La* I get the result in milliseconds.
But what puzzles me is this: if I run a query like TOWN:L* OR STREET:S*, I would guess
it should cache all the data for that set. Yet if I then run just TOWN:L*, which is a subset
of the first query, it still takes time to get the result back, as if it were not cached.
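
(One likely explanation, sketched here as an assumption about the standard setup
rather than a diagnosis: the queryResultCache keys on the whole query, so
TOWN:L* OR STREET:S* and TOWN:L* are two unrelated cache entries, and neither
reuses the other. Filters passed as separate fq parameters, by contrast, are each
cached independently in the filterCache and reused across requests. Note that
multiple fq parameters intersect (AND), so this is not a drop-in replacement for
the OR query, but it shows the reuse:

    First request; both filters are computed and cached:
      http://localhost:8983/solr/select?q=*:*&fq=TOWN:L*&fq=STREET:S*&rows=0

    Later request; fq=TOWN:L* is a filterCache hit:
      http://localhost:8983/solr/select?q=*:*&fq=TOWN:L*&rows=0

With rows=0, the numFound in the response gives the matching-document count
without fetching any stored fields. The filterCache itself is sized in
solrconfig.xml, e.g.:

    <filterCache class="solr.LRUCache" size="16384"
                 initialSize="4096" autowarmCount="4096"/>
)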


----- Original Message ----
From: Mike Klaas <mike.klaas@gmail.com>
To: solr-user@lucene.apache.org
Sent: Tuesday, December 4, 2007 2:33:24 PM
Subject: Re: Cache use

On 4-Dec-07, at 8:43 AM, Evgeniy Strokin wrote:

> Hello,...
> we have a 110M-record index under Solr. Some queries take a while,  
> but we need sub-second results. I guess the only solution is caching  
> (something else?)...
> We use the standard LRUCache. The docs say (as far as I understood)  
> that it loads a view of the index into memory and from then on works  
> with memory instead of the hard drive.
> So, my question: hypothetically, we could have the whole index in  
> memory if we had enough memory, right? In that case the results  
> should come back very fast. We have very rare updates. So I think  
> this could be a solution.

How big is the index on disk (the most important files are .frq,  
and .prx if you do phrase queries)?  How big and what exactly is a  
record in your system?  Do you do faceting/sorting?  How much memory  
do you have?  What does a typical query look like?

Performance is a tricky subject.  It is hard to give any kind of  
useful answer that applies in general.  The one thing I can say is  
that 110M is a _lot_ of docs for one system, especially if these are  
normal-sized documents.

regards,
-Mike