lucene-dev mailing list archives

From Phil W <>
Subject Re: Lucene index got corrupted
Date Wed, 24 May 2006 09:56:43 GMT
Stas Chetvertkov <schetvertkov <at>> writes:
> Hi All,
> We are using Lucene for indexing realtime news, and everything was working
> fine until now. I have found that one of our Lucene indexes is corrupted:
> all attempts to search it, optimize it, or merge it into another index
> result in a 'read past EOF' exception.
> My investigation showed that one of the segments in the index seems invalid.
> Its field index file, '_4lvd.fdx', is 24 bytes long, while all of its field
> normalization factor files (_4lvd.f?) are 2 bytes long. Since the number of
> documents is determined as length('_4lvd.fdx')/8, Lucene tries to read the
> 3rd byte from the normalization factor files and fails.
> Does anyone have any ideas how this index corruption could occur and how I
> can fix it? Any advice would be extremely helpful.

Did you ever get anywhere with this? We seem to be having issues with some
character data causing corruption in a similar manner (we haven't tracked it
down or verified which characters cause it, though a tab character might be
one of the offenders).
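
The length mismatch described in the quoted message can be checked
mechanically before attempting any repair. The following is only a minimal
sketch and not part of Lucene. It assumes, as the quoted message states, that
a segment's document count is the length of its .fdx file divided by 8, and
it further assumes that each norms file (.f?) holds one byte per document,
which is consistent with the failed read of the 3rd byte reported above. The
class name SegmentLengthCheck and its method names are hypothetical.

```java
import java.io.File;
import java.util.regex.Pattern;

public class SegmentLengthCheck {
    // Assumption taken from the quoted message: 8 bytes of .fdx per document.
    static final int FDX_BYTES_PER_DOC = 8;

    /** Document count implied by the length of the field index (.fdx) file. */
    public static long docCountFromFdx(long fdxLength) {
        return fdxLength / FDX_BYTES_PER_DOC;
    }

    /** Assumption: a norms file stores exactly one byte per document. */
    public static boolean normsMatch(long fdxLength, long normsLength) {
        return docCountFromFdx(fdxLength) == normsLength;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: SegmentLengthCheck <indexDir> <segmentName>");
            return;
        }
        File dir = new File(args[0]);
        String segment = args[1];

        File fdx = new File(dir, segment + ".fdx");
        long fdxLength = fdx.length(); // File.length() returns 0 if the file is missing
        System.out.println(fdx.getName() + ": " + fdxLength + " bytes, "
                + docCountFromFdx(fdxLength) + " documents implied");

        File[] files = dir.listFiles();
        if (files == null) {
            System.err.println("cannot list directory: " + dir);
            return;
        }
        // Norms files are matched as <segment>.f<digits>, e.g. _4lvd.f0
        Pattern norms = Pattern.compile(Pattern.quote(segment) + "\\.f\\d+");
        for (File f : files) {
            if (norms.matcher(f.getName()).matches()) {
                String status = normsMatch(fdxLength, f.length()) ? "ok" : "MISMATCH";
                System.out.println(f.getName() + ": " + f.length() + " bytes, " + status);
            }
        }
    }
}
```

Run it as `java SegmentLengthCheck <indexDir> _4lvd`. With the sizes reported
above (24-byte .fdx, 2-byte norms files) the .fdx file implies 3 documents, so
every norms file would be flagged as a mismatch. The check only reads file
lengths and does not modify the index.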
