lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <>
Subject Re: best way to ensure IndexWriter won't corrupt the index?
Date Wed, 25 Nov 2009 15:49:59 GMT
Before 2.4 it was possible that a crash of the OS, or sudden power
loss to the machine, could corrupt the index.  But that's been fixed
with 2.4.

The only known sources of corruption are hardware faults (bad RAM, bad
disk, etc.), and, accidentally allowing 2 writers to write to the same
index at once (this will very quickly cause corruption).  Lucene's
write lock normally prevents this from happening.

kill -9, JRE crashing, OS crashing, power loss, etc., should not cause
corruption for Lucene >= 2.4.

Backing up is definitely a good idea -- Lucene's
SnapshotDeletionPolicy makes it easy to do a hot backup (backup even
though IndexWriter is still changing the index).  There's a paper on
this (NOTE: I'm the author!) available  at that gives details (look for "Hot
backups with Lucene (green paper - html)".


On Wed, Nov 25, 2009 at 10:37 AM, Max Lynch <> wrote:
> On Wed, Nov 25, 2009 at 9:31 AM, Ian Lea <> wrote:
>> > What are the typical scenarios when the index will go corrupt?
>> Dodgy disks.
> I also have had index corruption on two occasions.  It is not a big deal for
> me since my data is fairly real time so the old documents aren't as
> important.
> However, I'm running this on a VPS with slicehost, so whether or not they
> use dodgy disks is not something I can confirm or even deal with.
> I do need to upgrade to 2.9 from 2.4, but I think one of the reasons for my
> index corruption is deleting the index.write file rather than removing the
> lock through the Lucene APIs.  This seems like it could be a cause of
> corruption, correct?

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message