lucene-dev mailing list archives

From "Stas Chetvertkov" <>
Subject Lucene index got corrupted
Date Wed, 06 Nov 2002 09:22:56 GMT
Hi All,

We are using Lucene to index real-time news, and everything was working
fine until now. I have found that one of our Lucene indexes is corrupted:
every attempt to search it, optimize it, or merge it into another index
results in a 'read past EOF' exception.

My investigation showed that one of the segments in the index seems to be
invalid. Its field index file, '_4lvd.fdx', is 24 bytes long, while all of
the field normalization factor files (_4lvd.f?) are 2 bytes long. Since the
number of documents is determined as length('_4lvd.fdx')/8, which gives 3
here, Lucene tries to read a 3rd byte from the normalization factor files
and fails.
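
The size mismatch described above can be checked mechanically. The following is a minimal sketch only, not part of Lucene: the class and method names are hypothetical, and it assumes (as the arithmetic in this message implies) 8 bytes per document in the .fdx file and one byte per document in each normalization factor file, with those files named <segment>.f followed by digits.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Lucene API: compares file sizes within one segment.
class SegmentSizeCheck {
    // Assumption taken from the message: document count = length(.fdx) / 8.
    static final long FDX_ENTRY_BYTES = 8;

    static long docCountFromFdx(long fdxLength) {
        return fdxLength / FDX_ENTRY_BYTES;
    }

    // Returns a description of each norm file that is shorter than the
    // document count implied by the segment's .fdx file.
    static List<String> findShortNormFiles(File indexDir, String segment) {
        long expectedDocs = docCountFromFdx(new File(indexDir, segment + ".fdx").length());
        List<String> problems = new ArrayList<>();
        File[] files = indexDir.listFiles();
        if (files == null) {
            return problems; // not a directory, or an I/O error
        }
        String prefix = segment + ".f";
        for (File f : files) {
            String name = f.getName();
            if (!name.startsWith(prefix)) {
                continue;
            }
            // Keep only ".f<digits>" so that .fdx, .fdt and .fnm are skipped.
            String suffix = name.substring(prefix.length());
            if (!suffix.matches("\\d+")) {
                continue;
            }
            // Assumption: one norm byte per document.
            if (f.length() < expectedDocs) {
                problems.add(name + ": " + f.length()
                        + " bytes, expected at least " + expectedDocs);
            }
        }
        return problems;
    }
}
```

For the segment in question, docCountFromFdx(24) gives 3, so every 2-byte _4lvd.f? file would be reported as too short. The check only detects the inconsistency; it does not repair the index.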

Does anyone have any idea how this index corruption could have occurred and
how I can fix it? Any advice would be extremely helpful.

Here is the exception that I get when trying to search this index:
Exception in thread "main" read past EOF
        at Source)
        at Source)
        at Source)
        at org.apache.lucene.index.SegmentReader.norms(Unknown Source)
        at org.apache.lucene.index.SegmentsReader.norms(Unknown Source)
        at Source)
        at Source)
        at Source)
        at Source)
        at<init>(Unknown Source)
        at Source)
        at Source)
        at Search.main(

I am also attaching an archive of the segment that is causing problems.

