lucene-dev mailing list archives

From Doug Cutting
Subject Re: detected corrupted index / performance improvement
Date Fri, 08 Feb 2008 17:25:08 GMT
Michael McCandless wrote:
> Merging is far more IO intensive.  With mergeFactor=10, we read from
> 40 input streams and write to 4 output streams when merging the
> tii/tis/frq/prx files.

If your disk can transfer at 50MB/s, and takes 5ms/seek, then 250kB 
reads and writes are the break-even point, where half the time is spent 
seeking and half transferring, and throughput is 25MB/s.  With 44 files 
open, that means the OS needs just 11MB of buffering to keep things 
above this threshold.  Since most systems have considerably larger 
buffer pools than 11MB, merging with mergeFactor=10 shouldn't be seek-bound.
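
The arithmetic above can be checked with a short sketch. It uses only the figures from this message (50MB/s transfer, 5ms per seek, 40 input plus 4 output streams) and assumes decimal units (1MB = 1,000,000 bytes) and one seek per chunk read or written. The function and variable names are illustrative, not part of Lucene.

```python
# Back-of-the-envelope disk model: every chunk costs one seek plus its transfer time.
# Assumption: decimal units, 1 MB = 1,000,000 bytes.

MB = 1_000_000


def break_even_bytes(rate_bytes_per_s: int, seek_ms: int) -> int:
    """Chunk size whose transfer time equals one seek."""
    return rate_bytes_per_s * seek_ms // 1000


def throughput_bytes_per_s(rate_bytes_per_s: int, seek_ms: int, chunk_bytes: int) -> float:
    """Effective throughput when each chunk requires one seek."""
    seek_s = seek_ms / 1000
    transfer_s = chunk_bytes / rate_bytes_per_s
    return chunk_bytes / (seek_s + transfer_s)


rate = 50 * MB        # disk transfer rate from the message
seek_ms = 5           # seek time from the message
streams = 40 + 4      # input and output streams with mergeFactor=10

chunk = break_even_bytes(rate, seek_ms)
effective = throughput_bytes_per_s(rate, seek_ms, chunk)
buffer_needed = streams * chunk

print(f"break-even chunk: {chunk / 1000:.0f} kB")
print(f"throughput at break-even: {effective / MB:.1f} MB/s")
print(f"buffering for {streams} streams: {buffer_needed / MB:.0f} MB")
```

Chunks larger than the break-even size push the effective throughput closer to the raw 50MB/s, which is why a buffer pool well above 11MB keeps the merge from being seek-bound under this model.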
