Subject: Re: Maximum index file size
From: Felipe Lobo
To: java-user@lucene.apache.org
Date: Fri, 23 Oct 2009 15:12:54 -0300

Hi, interesting
discussion. Suppose my index now holds 1 TB. I split it across 16 HDs
(65 GB per HD) in the same machine, which has 16 cores. Is using
ParallelMultiSearcher a good idea for this structure? Will results be
fast? Is there a better solution for this setup?

thanks,

On Fri, Oct 23, 2009 at 9:33 AM, Toke Eskildsen wrote:

> On Fri, 2009-10-23 at 08:49 +0200, Jake Mannix wrote:
> > One of the big problems you'll run into with this index size is that
> > you'll never have enough RAM to give your OS's IO cache enough room
> > to keep much of this index in memory, so you're going to be seeking
> > in this monster file a lot. [...]
>
> Solid State Drives help a lot in this respect. We've done experiments
> with a 40 GB index, adjusting the amount of RAM available for the file
> cache. We observed that search speed with SSDs wasn't nearly as
> susceptible to cache size as with conventional hard disks.
>
> Some quick and fairly unstructured notes on our observations:
> http://wiki.statsbiblioteket.dk/summa/Hardware
>
> > [...]
> > This may be mitigated by using really fast disks, possibly, which is
> > yet another reason why you'll need to do some performance profiling
> > on a variety of sizes with similar-to-production data sets.
>
> For our setup, the switch from conventional hard disks to SSDs moved
> the bottleneck from I/O to CPU/RAM.

--
Felipe Lobo
www.jusbrasil.com.br
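For readers of the archive: ParallelMultiSearcher fans a query out to each sub-searcher (one per shard/disk) on its own thread and merges the per-shard hits. A minimal plain-Java sketch of that fan-out/merge pattern is below; the shard lists and the substring "match" are hypothetical stand-ins, not Lucene API, and the score-based re-sorting that ParallelMultiSearcher does on the merged hits is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ShardedSearchSketch {

    // "Search" one shard: a plain substring match standing in for a real
    // Lucene query against the index stored on one disk.
    static List<String> searchShard(List<String> shard, String term) {
        List<String> hits = new ArrayList<>();
        for (String doc : shard) {
            if (doc.contains(term)) hits.add(doc);
        }
        return hits;
    }

    // Fan the query out to every shard on its own thread, then merge the
    // per-shard hits in shard order. The real ParallelMultiSearcher also
    // re-sorts the merged hits by score; that step is skipped here.
    static List<String> parallelSearch(List<List<String>> shards, String term)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(shards.size());
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (List<String> shard : shards) {
                futures.add(pool.submit(() -> searchShard(shard, term)));
            }
            List<String> merged = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                merged.addAll(f.get()); // blocks until that shard finishes
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> shards = Arrays.asList(
                Arrays.asList("lucene in action", "java basics"),
                Arrays.asList("lucene scoring", "ssd benchmarks"));
        System.out.println(parallelSearch(shards, "lucene"));
        // prints: [lucene in action, lucene scoring]
    }
}
```

With one shard per physical disk, each worker thread seeks on its own spindle, which is the main reason fan-out can help a 16-disk, 16-core box; whether it actually does depends on the profiling the thread discusses.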