lucene-dev mailing list archives

From <>
Subject Re: Re: optimized disk usage when creating a compound index
Date Tue, 10 Aug 2004 06:42:01 GMT

Dmitry Serebrennikov <> wrote on 09.08.2004:

> Well, I think this could work, but I'm not sure how this will behave if 
> an IndexReader is created on the new segment while it is still 
> uncompound. Then when you try to delete the individual files, you'd have 
> to implement something like "deletable" file for segments (to work with 
> Windows file locking).

That's right. I would use the deletable mechanism in IndexWriter
to delete the non-compound files of the index after creation of
the compound file. That's step 4 of my last mail. It would be done
within a commit lock.
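
To illustrate the "deletable" idea, here is a minimal sketch. It is not the IndexWriter code: the class DeferredDeleter and its methods are hypothetical names made up for this example, and the commit lock is left out. It only shows the pattern of trying to delete a file and remembering it for a later retry if the delete fails (for instance because a reader on Windows still has it open).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Lucene: remembers files that could not be
// deleted yet (e.g. still open in a reader on Windows) and retries them later.
class DeferredDeleter {
    private final List<File> pending = new ArrayList<>();

    // Try to delete now; if that fails, queue the file for a later retry.
    // Returns true if the file is gone after this call.
    boolean deleteOrDefer(File f) {
        if (!f.exists() || f.delete()) {
            return true;
        }
        pending.add(f);
        return false;
    }

    // Retry everything queued so far; returns how many files are still pending.
    int retryPending() {
        for (Iterator<File> it = pending.iterator(); it.hasNext();) {
            File f = it.next();
            if (!f.exists() || f.delete()) {
                it.remove();
            }
        }
        return pending.size();
    }
}
```

In a real index the pending list would have to be persisted (as the "deletable" file is) so that another writer can pick it up later; the sketch keeps it in memory only.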

> Anyway, what do you think of the original way proposed by Bernard? I 
> think that method was ok. If I understand correctly, in that method the 
> merge process does not end until compound file is created (as before), 
> but the files are deleted as they are merged in. I suppose there is a 
> chance that the compound file creation process fails and we would not 
> have any new segment since the files that were useable would have been 
> half deleted. Is that what's bothering you in this solution? To me this 
> seems acceptable because it shouldn't happen frequently. What do you 
> think? Is there anything I'm missing about Bernard's solution?

In Bernhard's solution the old segments that have been merged into the
new segment are still there while building the compound file. Disk space
is saved by deleting the non-compound files of the new segment earlier
than in the original implementation, immediately after copying them into
the compound file. However, a segment usually contains 1 to 3 big files
that make up most of its size, so the saving in disk space is not as big
as it could be with my solution. Each individual file still exists in 3
copies for a short period of time (while it is being copied).
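
A rough sketch of the copy-then-delete idea, to make the disk usage concrete. This is not Lucene's CompoundFileWriter: the class CopyThenDelete and its merge method are hypothetical, and the sketch only concatenates the files without writing the directory table a real compound file needs.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch, not Lucene's CompoundFileWriter: it only concatenates
// the files and does not write the entry table of a real compound file.
class CopyThenDelete {
    // Appends each source file to `compound` and deletes it right after its
    // copy completes. Returns the total number of bytes copied.
    static long merge(List<Path> sources, Path compound) throws IOException {
        long total = 0;
        try (OutputStream out = Files.newOutputStream(compound)) {
            for (Path src : sources) {
                total += Files.copy(src, out); // copy all bytes of src into out
                out.flush();
                Files.delete(src);             // free the space before the next copy
            }
        }
        return total;
    }
}
```

With this order, the new segment needs at most about its total size plus the size of its largest single file at any moment, instead of twice its total size. When one or two files dominate the segment, that peak is still close to twice the total, which is the limitation described above.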

Furthermore, deleting files in CompoundFileWriter.close is not
what I would expect from a CompoundFileWriter. But since it is only
used in SegmentMerger, it's ok.

I am not insisting on my solution. I was just about to commit Bernhard's
solution on Sunday, but then I thought it could be done better....

Now I am not sure what to do. I am still a little bit in favour of my
idea, but not so much....

> (By the way, Thanks for helping to maintain and improve this code!)
> Dmitry.

I think we are all doing this because it's fun and there is such a
great community immediately looking at, testing and reviewing our work.
