lucene-solr-user mailing list archives

From Dennis Gearon <gear...@sbcglobal.net>
Subject RE: Hardware Specs Question
Date Mon, 06 Sep 2010 20:01:51 GMT
Very interesting stuff!

I'm pretty sure that within 10 years or sooner, front-line storage for intensive
applications will be entirely non-hard-disk, with hard disks used only for backup/boot up.

Dennis Gearon

Signature Warning
----------------
EARTH has a Right To Life,
  otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php


--- On Mon, 9/6/10, Toke Eskildsen <te@statsbiblioteket.dk> wrote:

> From: Toke Eskildsen <te@statsbiblioteket.dk>
> Subject: RE: Hardware Specs Question
> To: "Dennis Gearon" <gearond@sbcglobal.net>, "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Date: Monday, September 6, 2010, 12:35 PM
> From: Dennis Gearon [gearond@sbcglobal.net]:
> > I wouldn't have thought that CPU was a big deal with
> > the speed/cores of CPUs
> > continuously growing according to Moore's law and the
> > change in disk speed
> > barely changing 50% in 15 years. Must have a lot to do
> > with caching.
> 
> I am not sure I follow you. When seek times are suddenly
> 100 times faster (slight exaggeration, but only slight) why
> wouldn't it cause the bottleneck to move? Yes, CPUs have
> increased tremendously in speed, but so have our processing
> needs. Lucene (and by extension Solr) was made with long
> seek times in mind and looking at the current market, it
> makes sense to continue supporting this for some years. If
> the software was optimized for sub-ms seek times, it might
> lower CPU usage or at the very least lower the need for
> caching (internal as well as external).
> 
> > What size indexes are you working with?
> 
> Around 40GB for our primary index. 9 million documents,
> AFAIR.
> 
> > Are you saying you can get the whole thing in memory?
> 
> No. For that test we had to reduce the index to 14GB on our
> 24GB test machine with Lucene's RAMDirectory. In order to
> avoid the "everything is cached and thus everything is the
> same speed"-problem, we lowered the amount of available
> memory to 3GB when we measured harddisk & SSD speed
> against the 14GB index. The short version is harddisks 200 raw
> queries/second, SSDs 774 queries/second and RAM 952 queries/second, but as
> always it is not so simple to extract a single number for
> performance when warm up and caching come into play. Let me
> be quick to add that this was with Lucene + custom code, not
> with Solr.
> 
> > That would negate almost any disk benefits.
> 
> That depends very much on your setup. It takes a fair
> amount of time to copy 14GB from storage into RAM so an
> index fully in RAM would either be very static or require
> some logic to handle updates and sync data in case of
> outages. I know there's some interesting work being done
> with this, but as SSDs are a lot cheaper than RAM and
> fulfill our needs, it is not something we pursue.
> 
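
To put the quoted figures side by side, here is a minimal Python sketch that works out the relative throughput of each storage type. The dictionary keys and the speedup helper are illustrative names, not something from the thread; the only data used are the three queries/second numbers Toke reports for the 14GB index, which he cautions are not a complete picture once warm up and caching are involved.

```python
# Throughput figures quoted in the thread (raw queries per second, 14GB index).
# The key names are illustrative labels, not identifiers from Lucene or Solr.
QPS = {"harddisk": 200, "ssd": 774, "ram": 952}


def speedup(fast: str, slow: str) -> float:
    """Return how many times more queries/second `fast` served than `slow`."""
    return QPS[fast] / QPS[slow]


if __name__ == "__main__":
    for fast, slow in [("ssd", "harddisk"), ("ram", "harddisk"), ("ram", "ssd")]:
        print(f"{fast} vs {slow}: {speedup(fast, slow):.2f}x")
```

On these numbers the SSDs served a little under four times the harddisk rate, while RAM was only about a quarter faster than the SSDs, which fits the remark that SSDs were enough for their needs.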
