lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Monsur Hossain" <mons...@gmail.com>
Subject Re: Changing the MergeFactor - should I reindex?
Date Fri, 30 Jun 2006 21:26:25 GMT
Thanks Otis.

When two segments are merged into one, are they merged contiguously,
so that the individual .del files for the individual segments will no
longer be needed?  Or are the .del files merged into a single larger
.del file for the new segment?

Monsur



On 6/30/06, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
> Hi Monsur,
>
> You don't need to reindex everything after changing merge factor.  As for growing the
.del file, I believe the .del file simply keeps track of documents that have been deleted
(and should thus not show up in search results).  Thus, I think you can let the file grow,
although at some point it may make sense to optimize everything - I'm not sure how much reader.isDeleted(docNum)
calls affect performance when the number of documents gets large.... but you'll find out :)
 And when you do, all you'll need to do is optimize the index on the indexing server and push
it to the search server.
>
> Otis
>
> ----- Original Message ----
> From: Monsur Hossain <monsurh@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Friday, June 30, 2006 3:39:27 PM
> Subject: Changing the MergeFactor - should I reindex?
>
> I have a system of 2 servers, one to index and one to search.  The
> index server updates the Lucene index and then copies the 200 meg
> index over to the search server.  Originally, the index server would
> optimize the index before copying.  To improve performance, I stopped
> optimizing, dropped the mergefactor down to 2, and then copied only
> the changed segments to the search server (most of the runtime was
> taken up optimizing and copying the index; this change dropped the
> runtime from 2 minutes down to 2 seconds!).
>
> After switching to this system, I've noticed that the .del file for
> the original 200 meg segment is growing pretty large (I'm updating
> every 2 minutes, and there are always some deletions/additions).
>
> My question is this: Am I fine letting this .del file grow, or should
> I reindex the entire index with a mergefactor of 2 (to better
> distribute the deletions)?
>
> Monsur
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message