lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Corrupt segments file full of zeros
Date Wed, 29 Jun 2011 10:54:45 GMT
On Tue, Jun 28, 2011 at 10:45 PM, Trejkaz <trejkaz@trypticon.org> wrote:
> On Wed, Jun 29, 2011 at 2:24 AM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>> Here's the issue:
>>
>>    https://issues.apache.org/jira/browse/LUCENE-3255
>>
>> It's because we read the first 0 int to be an ancient segments file
>> format, and the next 0 int to mean there are no segments.  Yuck!
>>
>> This format pre-dates Lucene 1.9, so the fix for 3.x is to stop
>> supporting this ancient format... but I don't see any easy way to fix
>> this pre-3.x where we must (by our back compat rules) support such an
>> ancient index.
>
> It's not possible to do something based on the existence of further
> zeroes after the first 8 bytes?  I would expect the original format to
> have no additional data after that, but I don't exactly know whether a
> corrupt file could be exactly 8 bytes long...

Yes, you're right, it is!  That would work, as long as the all 0s file
isn't exactly 8 bytes long (this time yours was 20).  But then we are
still vulnerable if the corruption just happens to produce an 8 byte
all 0s file...

Simon also had a good idea, which is to check the version of the prior
segments file, and refuse to accept this ancient version of the newer
segments if the prior one is "modern".

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message