lucene-dev mailing list archives

From: robert engels
Subject: Re: detected corrupted index / performance improvement
Date: Fri, 08 Feb 2008 18:04:43 GMT

But that would mean we should be using at least 250k buffers for the
IndexInput, not the 16k or so that is the default?

Is the OS smart enough to figure out that the file is being read
sequentially, and to adjust its physical read size to 256k based on
the other concurrent IO operations? That seems hard for it to figure
out without performing poorly in the general case.
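
To put rough numbers on the buffer-size concern, here is a minimal sketch that applies the disk figures from the quoted message below (50MB/s transfer, 5ms/seek) and assumes, pessimistically, that every read pays one full seek. The function name is hypothetical and the model ignores OS read-ahead and caching, which is exactly the open question here.

```python
def effective_throughput(read_bytes, transfer_bytes_per_s=50e6, seek_s=0.005):
    """Bytes per second delivered if each read of read_bytes costs one seek.

    Assumed model: time per read = seek time + transfer time.
    Defaults are the 50MB/s and 5ms/seek figures from the quoted message.
    """
    transfer_s = read_bytes / transfer_bytes_per_s
    return read_bytes / (seek_s + transfer_s)


# Compare the "16k or so" buffer with the 250k break-even size.
for size in (16_000, 250_000):
    mb_per_s = effective_throughput(size) / 1e6
    print(f"{size // 1000}kB reads: {mb_per_s:.1f} MB/s")
```

Under this model a 16k read spends almost all of its time seeking and delivers roughly 3MB/s, against 25MB/s at 250k. Whether real merges behave this badly depends on how much read-ahead the OS does on behalf of the small buffer.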

On Feb 8, 2008, at 11:25 AM, Doug Cutting wrote:

> Michael McCandless wrote:
>> Merging is far more IO intensive.  With mergeFactor=10, we read from
>> 40 input streams and write to 4 output streams when merging the
>> tii/tis/frq/prx files.
> If your disk can transfer at 50MB/s, and takes 5ms/seek, then 250kB  
> reads and writes are the break-even point, where half the time is  
> spent seeking and half transferring, and throughput is 25MB/s.   
> With 44 files open, that means the OS needs just 11MB of buffering  
> to keep things above this threshold.  Since most systems have  
> considerably larger buffer pools than 11MB, merging with  
> mergeFactor=10 shouldn't be seek-bound.
> Doug
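
The arithmetic in the quoted message can be checked with a short sketch. It assumes decimal units (1MB = 10^6 bytes, 1kB = 10^3 bytes) and uses hypothetical helper names; the break-even size is simply the amount of data that takes as long to transfer as one seek.

```python
def break_even_read_bytes(transfer_bytes_per_s, seek_s):
    """Read size at which transfer time equals seek time."""
    return transfer_bytes_per_s * seek_s


def total_buffer_bytes(open_files, read_bytes):
    """Buffering needed to hold one read of read_bytes per open file."""
    return open_files * read_bytes


# Figures from the quoted message: 50MB/s, 5ms/seek, 40 inputs + 4 outputs.
read_size = break_even_read_bytes(50e6, 0.005)
total = total_buffer_bytes(40 + 4, read_size)

print(f"break-even read size: {read_size / 1e3:.0f} kB")
print(f"buffering for 44 files: {total / 1e6:.0f} MB")
```

This reproduces the 250kB and 11MB figures. At the break-even size half of each read's time goes to seeking, which is where the 25MB/s throughput (half of 50MB/s) comes from.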