lucene-java-user mailing list archives

From Felipe Lobo <fel...@goshme.com>
Subject Re: Maximum index file size
Date Fri, 23 Oct 2009 18:12:54 GMT
Hi, interesting discussion.
Suppose my index is now 1 TB, split across 16 hard disks (~64 GB per disk)
in the same machine with 16 cores.
Is ParallelMultiSearcher a good idea for this structure? Will results be
fast? Is there a better solution for this structure?
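
For reference, a minimal sketch of the setup I have in mind (this assumes
the Lucene 2.9 API; the /diskN/index paths and the "body" field are just
placeholders):

import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ParallelMultiSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searchable;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class ShardedSearch {
    public static void main(String[] args) throws Exception {
        // One read-only sub-searcher per physical disk (paths are hypothetical).
        Searchable[] shards = new Searchable[16];
        for (int i = 0; i < shards.length; i++) {
            shards[i] = new IndexSearcher(
                FSDirectory.open(new File("/disk" + i + "/index")), true);
        }
        // ParallelMultiSearcher queries each sub-index in its own thread,
        // which should let the 16 cores and 16 disks work concurrently.
        ParallelMultiSearcher searcher = new ParallelMultiSearcher(shards);
        QueryParser parser = new QueryParser(Version.LUCENE_29, "body",
                new StandardAnalyzer(Version.LUCENE_29));
        Query query = parser.parse("some query");
        TopDocs hits = searcher.search(query, 10);
        System.out.println("total hits: " + hits.totalHits);
        searcher.close(); // also closes the sub-searchers
    }
}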

thanks,

On Fri, Oct 23, 2009 at 9:33 AM, Toke Eskildsen <te@statsbiblioteket.dk> wrote:

> On Fri, 2009-10-23 at 08:49 +0200, Jake Mannix wrote:
> >   One of the big problems you'll run into with this index size is that
> > you'll never have enough RAM to give your OS's IO cache enough room to
> keep
> > much of this index in memory, so you're going to be seeking in this
> monster
> > file a lot. [...]
>
> Solid State Drives help a lot in this respect. We've done experiments
> with a 40GB index, adjusting the amount of RAM available for the
> file cache. We observed that search speed on SSDs wasn't nearly as
> susceptible to cache size as on conventional hard disks.
>
> Some quick and fairly unstructured notes on our observations:
> http://wiki.statsbiblioteket.dk/summa/Hardware
>
> > [...]
> > This may be mitigated by using really fast disks, which is yet
> > another reason why you'll need to do some performance profiling on a
> > variety of sizes with similar-to-production data sets.
>
> For our setup, a switch from conventional harddisks to SSDs moved the
> bottleneck from I/O to CPU/RAM.


-- 
Felipe Lobo
www.jusbrasil.com.br
