lucene-dev mailing list archives

From Phil W <phi...@volantis.com>
Subject Re: Lucene index got corrupted
Date Wed, 24 May 2006 09:56:43 GMT
Stas Chetvertkov <schetvertkov <at> oilspace.com> writes:
> 
> Hi All,
> 
> We are using Lucene for indexing realtime news, and everything was working
> fine until now. I have found that one of our Lucene indexes is corrupted:
> every attempt to search it, optimize it, or merge it into another index
> results in a 'read past EOF' exception.
> 
> My investigation showed that one of the segments in the index seems invalid.
> Its field index file, '_4lvd.fdx', is 24 bytes long, while all field
> normalization factor files (_4lvd.f?) are 2 bytes long. Since the number of
> documents is determined as length('_4lvd.fdx')/8, Lucene tries to read the 3rd
> byte from the normalization factor files and fails.
> 
> Does anyone have any ideas how this index corruption could occur and how I
> can fix it? Any advice would be extremely helpful.

Did you ever get anywhere with this? We seem to be seeing similar corruption
that is triggered by some character data. We haven't tracked it down or
verified which characters cause it, though a tab character might be one of
the offenders.
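
For anyone wanting to check other segments for the same symptom, the size
comparison described in the quoted message can be sketched roughly as below.
This is only a sketch under assumptions: it takes the 8-bytes-per-document
rule for the .fdx file from the quoted message, assumes one norm byte per
document (inferred from the "3rd byte" remark), and assumes norm files are
named <segment>.f followed by digits. The class and method names are made up
for illustration and are not part of Lucene.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: compares file sizes only.
class SegmentCheck {

    // Per the quoted message: document count = length of the .fdx file / 8.
    static long docCountFromFdx(long fdxLength) {
        return fdxLength / 8;
    }

    // Assumption: a norm file holds one byte per document, so it is too
    // short when it has fewer bytes than the .fdx file implies documents.
    static boolean normsTooShort(long fdxLength, long normLength) {
        return normLength < docCountFromFdx(fdxLength);
    }

    // Returns the names of norm files in indexDir that are too short for
    // the given segment (e.g. "_4lvd").
    static List<String> findShortNormFiles(File indexDir, String segment) {
        List<String> bad = new ArrayList<>();
        long fdxLength = new File(indexDir, segment + ".fdx").length();
        // Assumed naming: "<segment>.f" followed by digits; this skips
        // .fdx, .fdt, .fnm and .frq, which have letters after the "f".
        Pattern normName = Pattern.compile(Pattern.quote(segment) + "\\.f\\d+");
        File[] files = indexDir.listFiles();
        if (files == null) {
            return bad; // not a directory, or not readable
        }
        for (File f : files) {
            if (normName.matcher(f.getName()).matches()
                    && normsTooShort(fdxLength, f.length())) {
                bad.add(f.getName());
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java SegmentCheck <indexDir> <segment>");
            return;
        }
        List<String> bad = findShortNormFiles(new File(args[0]), args[1]);
        System.out.println("norm files shorter than the .fdx doc count: " + bad);
    }
}
```

With the sizes from the quoted message (a 24-byte .fdx and 2-byte norm files)
this would flag every norm file of the segment. It only detects the mismatch;
it does not repair anything.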
