lucene-dev mailing list archives

From Itamar Syn-Hershko <ita...@code972.com>
Subject Corrupt index
Date Wed, 13 Jun 2012 00:20:18 GMT
Hi Java devs,

I'm a Lucene.Net committer, and there is a chance we have a bug in our
FSDirectory implementation that causes indexes to get corrupted when
indexing is cut off while the IndexWriter (IW) is still open. Since it stems
from some retroactive fixes you made, I'd appreciate your feedback.

Correct me if I'm wrong, but by design Lucene should be able to recover
rather quickly from power failures or app crashes. Since existing segment
files are read only, only new segments that are still being written can get
corrupted. Hence, recovering from worst-case scenarios is done by simply
removing the write.lock file. The worst that could happen then is having
the last segment damaged, and that can be fixed by removing those files,
possibly by running CheckIndex on the index.
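
To make the design I am describing concrete, here is a minimal Java sketch
of the "sync first, then publish the commit point" pattern. It is not Lucene
code: the class name CommitSketch, the _N.seg file names and the decimal
generation numbers are made up for illustration, and it assumes that an
atomic rename within one directory is available on the file system. It also
does not sync the directory entry itself.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; names and file layout are assumptions.
class CommitSketch {

    // Write a brand-new file and force its contents to stable storage.
    public static void writeAndSync(Path file, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true); // fsync while the channel is still open
        }
    }

    // Commit: sync the segment data first, then publish the commit point
    // by renaming a fully written and synced pending file into place.
    public static void commit(Path dir, String segmentName, byte[] segmentData,
                              long generation) throws IOException {
        writeAndSync(dir.resolve(segmentName), segmentData);
        Path pending = dir.resolve("pending_segments_" + generation);
        writeAndSync(pending, segmentName.getBytes(StandardCharsets.UTF_8));
        Files.move(pending, dir.resolve("segments_" + generation),
                StandardCopyOption.ATOMIC_MOVE);
    }

    // Recovery: only published commit points count. Returns -1 if none exist.
    public static long latestGeneration(Path dir) throws IOException {
        long latest = -1;
        try (DirectoryStream<Path> stream =
                     Files.newDirectoryStream(dir, "segments_*")) {
            for (Path p : stream) {
                String name = p.getFileName().toString();
                long gen = Long.parseLong(name.substring("segments_".length()));
                latest = Math.max(latest, gen);
            }
        }
        return latest;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("index");
        commit(dir, "_0.seg", "docs".getBytes(StandardCharsets.UTF_8), 1);
        // Simulated crash: a new segment was written but never committed.
        writeAndSync(dir.resolve("_1.seg"), "more".getBytes(StandardCharsets.UTF_8));
        System.out.println("latest commit generation: " + latestGeneration(dir));
    }
}
```

In this sketch a crash before the rename leaves only unreferenced files
behind, so a reader still sees the previous commit. If the sync step is
skipped or has no effect, that guarantee is lost, which is the kind of
failure I suspect we are hitting.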

Last week I was working with rather large indexes and crashed my app
while it was indexing. I wasn't able to open the index afterwards, and Luke
was even kind enough to wipe the index folder clean, even though I had opened
it in read-only mode. I re-ran this, and after another crash, running
CheckIndex revealed nothing: the index was detected as an empty one. I am not
entirely sure what caused this, but I suspect the index was corrupted by
the crash.

I've been looking at these:

https://issues.apache.org/jira/browse/LUCENE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
https://issues.apache.org/jira/browse/LUCENE-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

And it seems like this is what I was experiencing. Mike and Mark will
probably be able to tell if this is what they saw or not, but as far as I
can tell this is not expected behavior for a Lucene index.

What I'm looking for at the moment is some advice on which FSDirectory
implementation to use to make sure no corruption can happen. The 3.4
version (where the fix for LUCENE-3418 was committed) seems to handle a lot
of things that 3.0 doesn't, but on the other hand the LUCENE-3418 bug was
introduced by changes made to the 3.0 codebase.

Also, is there any test in the suite checking for those scenarios?

I would appreciate any help on this,

Itamar.
