lucene-java-user mailing list archives

From Shlomit Rosen <SHLOM...@il.ibm.com>
Subject Index corruption with lucene 3.0.3
Date Wed, 17 Dec 2014 09:42:13 GMT
Hello, 

We have a client that is using Lucene 3.0.3.
They are working with a NAS storage device that recently had permission issues,
which might have caused some "out of disk space" exceptions during indexing.
We are not sure whether they also suffered JDK crashes in the past few months,
since we found dmp files and javacores on their system.

Consequently, they now have 3 corrupted indices. 
All of them show a similar issue: 

java.io.IOException: No sub-file with id _xv.fnm found
        at org.apache.lucene.index.CompoundFileReader.openInput(CompoundFileReader.java:137)
        at org.apache.lucene.index.CompoundFileReader.openInput(CompoundFileReader.java:125)
        at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:68)
        at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:120)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:605)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:583)
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:470)
        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:883)


Looking at the file listing of the indices, I see that this file (_xv.fnm)
is indeed missing, but I also see that a compound file with the same segment
name (_xv.cfs) exists on disk.
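
The exception above is raised by CompoundFileReader, so _xv.fnm is being looked up as an entry inside _xv.cfs. One way to see which sub-files the compound file actually lists is to read its header table directly. The following is only a minimal sketch, assuming the compound file layout used by Lucene 3.0.x (a VInt entry count, then for each entry a long data offset followed by a length-prefixed UTF-8 file name). The class name CfsEntryLister and the 4096-byte name limit are hypothetical and not part of Lucene.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: lists the entry table of a .cfs file.
public class CfsEntryLister {

    // Reads a Lucene VInt: 7 data bits per byte, low-order group first,
    // high bit set on every byte except the last.
    static int readVInt(DataInputStream in) throws IOException {
        int b = in.readUnsignedByte();
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.readUnsignedByte();
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    // Returns sub-file name -> data offset, in the order stored in the header.
    // Assumed layout: VInt count, then (long offset, VInt length, UTF-8 bytes) per entry.
    static Map<String, Long> listEntries(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        int count = readVInt(in);
        if (count < 0) {
            throw new IOException("Implausible entry count: " + count);
        }
        Map<String, Long> entries = new LinkedHashMap<String, Long>();
        for (int i = 0; i < count; i++) {
            long offset = in.readLong(); // big-endian 8-byte value
            int length = readVInt(in);
            // Arbitrary sanity limit so a damaged header fails fast.
            if (length < 0 || length > 4096) {
                throw new IOException("Implausible name length: " + length);
            }
            byte[] name = new byte[length];
            in.readFully(name);
            entries.put(new String(name, StandardCharsets.UTF_8), offset);
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a compound file, e.g. a copy of _xv.cfs
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]))) {
            for (Map.Entry<String, Long> e : listEntries(in).entrySet()) {
                System.out.println(e.getKey() + " @ offset " + e.getValue());
            }
        }
    }
}
```

The sketch only reads the header and never writes, but it is still safer to run it against a copy of the file. If the header itself is truncated or damaged, it fails with an EOFException or IOException instead of printing a listing.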

My question is:
        Is there a way to "save" the collection by re-creating the fnm
file from the cfs file (or in any other way)?
        Or does our client need to re-index the entire collection?
(We assume the CheckIndex -fix option is not suitable, because we cannot know
which documents would be lost.)

I'm attaching the CheckIndex output for reference.

Thanks in advance!
Shlomit


