lucy-user mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: [lucy-user] When to use optimize()
Date Tue, 27 Nov 2012 00:08:31 GMT
On Sun, Nov 25, 2012 at 12:08 PM, Thomas den Braber <thomas@delos.nl> wrote:
> Reading through the documentation I could not find info about when to
> optimize.

Every Indexer session which changes index content and ends in a commit()
creates a new segment.  Once written, segments are never modified.  However,
they are periodically recycled by feeding their content into the segment
currently being written.

The optimize() method causes all existing index content to be fed back into
the Indexer.  When commit() completes after an optimize(), the index will
consist of one segment.
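
To make the segment bookkeeping concrete, here is a toy model of the behavior
described above. It is only a sketch in Python, not Lucy's API or
implementation: ToyIndex and ToyIndexer are hypothetical names, a "segment" is
just a tuple of documents, the automatic periodic recycling is left out, and
the model assumes that optimize() merely records a request which commit() then
carries out.

```python
class ToyIndex:
    """Toy stand-in for an index: an ordered list of immutable segments."""

    def __init__(self):
        self.segments = []  # each segment is a tuple of documents


class ToyIndexer:
    """One indexing session against a ToyIndex (hypothetical, not Lucy's API)."""

    def __init__(self, index):
        self.index = index
        self.pending = []                # documents added during this session
        self.optimize_requested = False

    def add_doc(self, doc):
        self.pending.append(doc)

    def optimize(self):
        # Assumption of this model: only record the request here;
        # the actual rewrite happens when commit() runs.
        self.optimize_requested = True

    def commit(self):
        docs = list(self.pending)
        kept = list(self.index.segments)
        if self.optimize_requested:
            # Feed all existing content back into the segment being written.
            docs = [d for seg in kept for d in seg] + docs
            kept = []
        if docs:
            # A session that changed content writes exactly one new segment.
            kept.append(tuple(docs))
        self.index.segments = kept


index = ToyIndex()

# Three ordinary sessions, each ending in commit(): one new segment apiece.
for batch in (["a", "b"], ["c"], ["d", "e"]):
    session = ToyIndexer(index)
    for doc in batch:
        session.add_doc(doc)
    session.commit()
before = len(index.segments)

# One more session that calls optimize() before commit().
session = ToyIndexer(index)
session.optimize()
session.commit()
after = len(index.segments)

print(before, after)
```

In this model the three ordinary sessions leave three segments behind, and the
session that calls optimize() before commit() rewrites them into one.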

Several years ago, there was a significant search-time performance benefit to
collapsing down to a single segment versus even two segments, as a
single-segment index allowed searches to avoid an extra level of indirection
affecting many, many low-level operations.  That changed back in 2008-2009
when both Lucy and Lucene were modified to use per-segment searching all the
time, eliminating the extra indirection.  Now the effect of collapsing to a
single segment is much less significant, and calling optimize() is rarely
justified.

(For another perspective on segment recycling, see
Lucy::Docs::Cookbook::FastUpdates.)

> Is optimize needed after creating a fresh index from scratch?

No, since a fresh index created in one Indexer session will be a single
segment.

> Do I have to call optimize() before or after commit()?

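Before.  Call optimize() during the Indexer session and then commit(); the
index consists of a single segment once that commit() completes.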

Marvin Humphrey
