lucene-dev mailing list archives

From Arvind Srinivasan <luc...@ziplip.com>
Subject Potential Segment corruption
Date Thu, 26 May 2005 16:13:31 GMT
Hi,
We have seen Lucene segments become corrupted in the following situation.
During merging of segments, the following sequence of operations takes place:
  (1) The index is locked.
  (2) A new segment name is obtained by calling newSegmentName(), which
      basically calls segmentInfos.counter++.
  (3) Data is written to the new segment.
  (4) The segments file is rewritten.
  (5) Old segments are deleted or marked for deletion.
Corruption is possible when an exception occurs in step (3), preventing
the commit to the segments file, e.g. no disk space, a lost network share,
bad segments being merged, etc.
Because the segment files are not replaced, there is no corruption immediately.
However, on the next merge operation the index will become corrupted. [There is
a scenario where the corruption may not occur: if the new segment
is bigger than the failed one.] I am not sure what effect this has on the
compound file store.
The cause of this issue can be traced to segmentInfos.counter. Because the counter
is not changed in the segments file, the next merge operation will use the same
segment name as the failed one, and if you are using any standard Directory
implementation, it will probably write the segment to the same file location. Note
that the merge operation opens the segment files in read-write mode, and therefore
we start with a non-empty file.
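
To make the failure mode concrete, here is a minimal simulation in plain Java.
It is only a sketch and not Lucene code: CounterReuseDemo, committedCounter,
files and merge() are hypothetical names, the "directory" is an in-memory map,
and the read-write open is modelled as overwriting from offset 0 without
truncating. The base-36 segment naming is an assumption about how the counter
becomes a name.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an index directory; not Lucene code.
class CounterReuseDemo {
    // Counter value as last committed to the (simulated) segments file.
    static int committedCounter = 5;
    // Simulated directory: file name -> file contents.
    static final Map<String, StringBuilder> files = new HashMap<>();

    static String merge(String data, boolean failBeforeCommit) throws IOException {
        // Step 2: the name comes from an in-memory copy of the counter.
        int counter = committedCounter;
        String name = "_" + Integer.toString(counter++, Character.MAX_RADIX);

        // Step 3: write from offset 0 without truncating an existing file.
        StringBuilder file = files.computeIfAbsent(name, k -> new StringBuilder());
        file.replace(0, Math.min(file.length(), data.length()), data);

        if (failBeforeCommit) {
            // The failure happens before step 4, so the counter is never committed.
            throw new IOException("simulated: disk full");
        }

        // Step 4: only now does the incremented counter reach the segments file.
        committedCounter = counter;
        return name;
    }

    public static void main(String[] args) throws IOException {
        try {
            merge("AAAAAAAAAA", true);
        } catch (IOException e) {
            System.out.println("first merge failed: " + e.getMessage());
        }
        String name = merge("BBBB", false);
        // The second merge reuses the name and inherits the stale tail.
        System.out.println(name + " -> " + files.get(name)); // _5 -> BBBBAAAAAA
    }
}
```

If the second merge had written at least as many bytes as the failed one, no
stale tail would remain, which matches the exception noted above.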
Some options are:
(1) Commit the counter right after the newSegmentName() call. This way we never
    reuse the segment name.
(2) Add a callback API to the Directory interface for new segment creation,
    allowing the directory to clean up before a new segment is written.
(3) Provide a rollback mechanism in the event of merge failure (using the
    deleteable functionality).
(4) For the compound file store, require that the file be empty. (Possibly it
    can use the callback in (2) to clean up.)
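
A rough sketch of how options (1) and (3) could fit together, again in plain
Java rather than against the real Lucene classes. SafeMergeSketch,
reserveSegmentName() and the in-memory files map are hypothetical, and
incrementing committedCounter stands in for rewriting the segments file.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of options (1) and (3); not Lucene code.
class SafeMergeSketch {
    // Counter value as stored in the (simulated) segments file.
    static int committedCounter = 5;
    // Simulated directory: file name -> file contents.
    static final Map<String, String> files = new HashMap<>();

    // Option (1): commit the incremented counter before any data is written,
    // so a name is never handed out twice even if the merge fails later.
    static String reserveSegmentName() {
        String name = "_" + Integer.toString(committedCounter, Character.MAX_RADIX);
        committedCounter++; // stands in for rewriting the segments file
        return name;
    }

    // Option (3): roll back by deleting whatever the failed merge wrote.
    static String merge(String data, boolean fail) throws IOException {
        String name = reserveSegmentName();
        try {
            files.put(name, data);
            if (fail) {
                throw new IOException("simulated: disk full");
            }
            return name;
        } catch (IOException e) {
            files.remove(name); // clean up the partial segment
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        try {
            merge("AAAAAAAAAA", true);
        } catch (IOException e) {
            System.out.println("first merge failed: " + e.getMessage());
        }
        String name = merge("BBBB", false);
        System.out.println(name + " -> " + files.get(name));                 // _6 -> BBBB
        System.out.println("leftover _5 present: " + files.containsKey("_5")); // false
    }
}
```

Either measure alone would avoid the reuse in this toy model. Committing the
counter early costs an extra write of the segments file per merge, while the
rollback depends on the cleanup itself succeeding.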
We should apply as many of them as possible to make the merge code robust against
potential failures. I think that with the increasing adoption of Lucene, we need to
think about data corruption and recovery issues. More later,

Arvind.