lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Question on writer optimize() / file merging?
Date Mon, 17 Jan 2011 12:55:12 GMT
See below:

On Sun, Jan 16, 2011 at 10:15 AM, sol myr <solmyr72@yahoo.com> wrote:

> Hi,
>
> I'm trying to understand the behavior of file merging / optimization.
> I see that whenever my IndexWriter calls 'commit()', it creates a new file
> (or fileS).
> I also see these files merged when calling 'optimize()' , as much as
> allowed by the parameter 'NoCFSRatio' .
>
> But I'm still trying to figure out:
>
> 1) Will my writer still perform some file merging, even if I don't
> explicitly call 'optimize()'?
>
>
Yes. The writer merges segments on its own as you add documents. The merge
factor controls this so you don't end up with a huge number of files. There
are some nifty diagrams floating around on the net, but I don't have one
right at hand...
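
In place of a diagram, here is a minimal sketch in plain Java of how a merge
factor keeps the segment count down. It is a toy model, not Lucene's actual
merge policy: it assumes every flush produces a segment of size 1, and that a
merge happens as soon as mergeFactor adjacent segments of equal size exist.
The names MergeFactorSketch and simulate are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT Lucene code. Segment "size" is counted in flushes.
class MergeFactorSketch {

    /** Returns the segment sizes (oldest first) left after the given number of flushes. */
    static List<Integer> simulate(int mergeFactor, int flushes) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        List<Integer> segments = new ArrayList<>();
        for (int i = 0; i < flushes; i++) {
            segments.add(1); // each flush writes one new small segment

            // Cascade: keep merging while the newest mergeFactor segments are equal in size.
            while (segments.size() >= mergeFactor) {
                int n = segments.size();
                int size = segments.get(n - 1);
                boolean allEqual = true;
                for (int j = n - mergeFactor; j < n; j++) {
                    if (segments.get(j) != size) {
                        allEqual = false;
                        break;
                    }
                }
                if (!allEqual) {
                    break;
                }
                // Replace the mergeFactor small segments with one larger segment.
                for (int j = 0; j < mergeFactor; j++) {
                    segments.remove(segments.size() - 1);
                }
                segments.add(size * mergeFactor);
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        System.out.println(simulate(10, 25)); // prints [10, 10, 1, 1, 1, 1, 1]
    }
}
```

With a merge factor of 10, 25 flushes leave 7 segments in this model instead
of 25. A lower merge factor means fewer segments but more merging work.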



> 2) Is there a way to configure the number of files, or their size?
>

IndexWriter.setMergeFactor controls the number of segments. There's no way
I know of to control by size, however.

> 3) I always keep an open IndexSearcher (and IndexReader). I know they
> should be re-opened when a change occurs, but it's not crucial to see
> changes immediately, so I just poll periodically, and it might be a few
> minutes before my reader is re-opened and allowed to see changes.
> But will this approach disturb the writer's ability to optimize / merge
> files? If a reader is open, will it prevent file merging?
>
>
No, this is a fine approach. Lucene index segments are never changed. A
merge will #copy# the segments being merged to a new segment, and when you
open a new reader it will look at the new segment while the old reader
merrily looks at the old segments. This is why the disk space may double
during a merge.
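
A minimal sketch of that idea in plain Java, in case it helps. This is a toy
model and not Lucene code: the "index" is just a list of immutable segment
names, a "reader" is whatever list was current when it was opened, and the
names SnapshotSketch, flush, mergeAll and openReader are invented for
illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: NOT Lucene code. Segments are write-once; a merge swaps in a new list.
class SnapshotSketch {
    private List<String> current = Collections.emptyList();
    private int counter = 0;

    /** A flush adds one new segment; existing segments are left untouched. */
    void flush() {
        List<String> next = new ArrayList<>(current);
        next.add("seg" + counter++);
        current = Collections.unmodifiableList(next);
    }

    /** A merge copies everything into one new segment and publishes a new list. */
    void mergeAll() {
        if (current.size() < 2) {
            return; // nothing to merge
        }
        current = Collections.singletonList("seg" + counter++);
    }

    /** A reader is a point-in-time view: it keeps the list that was current when opened. */
    List<String> openReader() {
        return current;
    }

    public static void main(String[] args) {
        SnapshotSketch index = new SnapshotSketch();
        index.flush();
        index.flush();
        List<String> oldReader = index.openReader();
        index.mergeAll();
        System.out.println(oldReader);          // prints [seg0, seg1]
        System.out.println(index.openReader()); // prints [seg2]
    }
}
```

The old reader still sees seg0 and seg1 after the merge, so in this model
those segments and the merged seg2 exist side by side until the old reader
goes away, which is where the extra disk space comes from.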

Best
Erick


> Thanks
