lucene-dev mailing list archives

From robert engels <>
Subject Re: detected corrupted index / performance improvement
Date Wed, 06 Feb 2008 23:22:59 GMT
That doesn't help. With lazy writing/buffering by the OS, there is no
guarantee that if the last written block is OK, the earlier blocks
in the file are OK as well.

The OS/drive is going to physically write the blocks in whatever order is
most efficient. Only after a sync would a good last block imply that the
earlier blocks are good too (and a sync is what we are trying to avoid).
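
To make the ordering concern concrete, here is a minimal Java sketch, not Lucene code. The class SyncedFooter, the END_MARKER bytes, and the method names are all hypothetical. It writes a payload, calls FileChannel.force(true) before appending the end marker, and forces again afterwards. Without the first force, the OS would be free to flush the marker block before the payload blocks, so a marker found after a crash would say nothing about the bytes before it. Even with force, a drive-level write cache may weaken the guarantee.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

class SyncedFooter {
    // Hypothetical 8-byte end-of-file marker; not part of any Lucene file format.
    static final byte[] END_MARKER = "SEG_DONE".getBytes(StandardCharsets.US_ASCII);

    // Write payload, sync, then append the marker and sync again.
    static void writeWithFooter(Path file, byte[] payload) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            writeFully(ch, ByteBuffer.wrap(payload));
            // Ask the OS to push the payload to the device before the marker exists.
            ch.force(true);
            writeFully(ch, ByteBuffer.wrap(END_MARKER));
            ch.force(true);
        }
    }

    // A single write() call may be partial, so loop until the buffer is drained.
    static void writeFully(FileChannel ch, ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) {
            ch.write(buf);
        }
    }

    // True if the last END_MARKER.length bytes of the file equal the marker.
    static boolean hasFooter(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            if (size < END_MARKER.length) {
                return false;
            }
            ByteBuffer tail = ByteBuffer.allocate(END_MARKER.length);
            long start = size - END_MARKER.length;
            while (tail.hasRemaining()) {
                if (ch.read(tail, start + tail.position()) < 0) {
                    break;
                }
            }
            return Arrays.equals(tail.array(), END_MARKER);
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("segment", ".bin");
        writeWithFooter(f, "postings".getBytes(StandardCharsets.US_ASCII));
        System.out.println("footer present: " + hasFooter(f));
        Files.delete(f);
    }
}
```

The first force(true) is exactly the cost this thread is trying to avoid. Dropping it makes the write cheaper but turns the marker into a hint rather than a guarantee.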

On Feb 6, 2008, at 5:15 PM, DM Smith wrote:

> On Feb 6, 2008, at 5:42 PM, Michael McCandless wrote:
>> robert engels wrote:
>>> Do we have any way of determining if a segment is definitely OK/VALID?
>> The only way I know is the CheckIndex tool, and it's rather slow (and
>> it's not clear that it always catches all corruption).
> Just a thought. It seems that the discussion has revolved around  
> whether a crash or similar event has left the file in an  
> inconsistent state. Without looking into how it is actually done,  
> I'm going to guess that the writing is done from the start of the  
> file to its end. That is, no "out of order" writing.
> If this is the case, how about adding a marker to the end of the  
> file of a known size and pattern. If it is present then it is  
> presumed that there were no errors in getting to that point.
> Even with out of order writing, one could write an 'INVALID' marker  
> at the beginning of the operation and then upon reaching the end of  
> the writing, replace it with the valid marker.
> If neither marker is found then the index is one from before the  
> capability was added and nothing can be said about the validity.
> -- DM
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
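
As an illustration of the INVALID-then-valid marker idea quoted above, here is a small Java sketch, not Lucene code. The class HeaderMarker, both marker byte strings, and the State enum are hypothetical. The file starts with an INVALID marker, the payload is written and synced, and only then is the marker overwritten in place with the valid one. The check reports three outcomes, matching the proposal: valid, invalid, or unknown for files written before markers existed. The sync between payload and marker flip is an assumption added here to address the write-reordering objection at the top of this message; without it the valid marker could reach the disk first.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class HeaderMarker {
    enum State { VALID, INVALID, UNKNOWN }

    // Hypothetical fixed-size (8-byte) markers stored at offset 0.
    static final byte[] INVALID_MARKER = "INVALID!".getBytes(StandardCharsets.US_ASCII);
    static final byte[] VALID_MARKER = "VALID_OK".getBytes(StandardCharsets.US_ASCII);

    static void write(Path file, byte[] payload) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(0);              // start from an empty file
            raf.write(INVALID_MARKER);     // a crash from here on leaves INVALID behind
            raf.write(payload);
            raf.getFD().sync();            // payload must be on disk before the flip
            raf.seek(0);
            raf.write(VALID_MARKER);       // same length, so the payload is untouched
            raf.getFD().sync();
        }
    }

    static State check(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            if (raf.length() < VALID_MARKER.length) {
                return State.UNKNOWN;
            }
            byte[] head = new byte[VALID_MARKER.length];
            raf.readFully(head);
            if (Arrays.equals(head, VALID_MARKER)) {
                return State.VALID;
            }
            if (Arrays.equals(head, INVALID_MARKER)) {
                return State.INVALID;
            }
            return State.UNKNOWN;          // no marker: nothing can be said
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("segment", ".bin");
        write(f, new byte[] {1, 2, 3});
        System.out.println(check(f));
        Files.delete(f);
    }
}
```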
