lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <>
Subject Re: best way to ensure IndexWriter won't corrupt the index?
Date Thu, 26 Nov 2009 19:15:48 GMT
Right, in 2.4, if you kill -9, pull power, OS crashes, etc., it should
not corrupt the index.

Can you share details on what corruption you see?  Is it possible
there are two IndexWriters open on the index at once?


On Thu, Nov 26, 2009 at 2:08 PM, Khosro Asgharifard
<> wrote:
> Hello,
>>>Before 2.4 it was possible that a crash of the OS, or sudden power
>>>loss to the machine, could corrupt the index.  But that's been fixed
>>>with 2.4.
> Did you mean that Lucene does not have this issue in 2.4.1?
> We are running the program that index some data ,and sometime we must shutdown Tomcat,
> and in some case the index corrupt.This is a probelm in our program,and our data is too
huge and
> we can not run reindeing  all data agian.We use Lucene 2.4.0 with Compass.
> Best Regards
> Khosro.
>>>The only known sources of corruption are hardware faults (bad RAM, bad
>>>disk, etc.), and, accidentally allowing 2 writers to write to the same
>>>index at once (this will very quickly cause corruption).  Lucene's
>>>write lock normally prevents this from happening.
>>>kill -9, JRE crashing, OS crashing, power loss, etc., should not cause
>>>corruption for Lucene >= 2.4.
>>>Backing up is definitely a good idea -- Lucene's
>>>SnapshotDeletionPolicy makes it easy to do a hot backup (backup even
>>>though IndexWriter is still changing the index).  There's a paper on
>>>this (NOTE: I'm the author!) available  at
>>> that gives details (look for "Hot
>>>backups with Lucene (green paper - html)".
> Mike
> On Wed, Nov 25, 2009 at 10:37 AM, Max Lynch <> wrote:
>> On Wed, Nov 25, 2009 at 9:31 AM, Ian Lea <> wrote:
>>> > What are the typical scenarios when the index will go corrupt?
>>> Dodgy disks.
>> I also have had index corruption on two occasions.  It is not a big deal for
>> me since my data is fairly real time so the old documents aren't as
>> important.
>> However, I'm running this on a VPS with slicehost, so whether or not they
>> use dodgy disks is not something I can confirm or even deal with.
>> I do need to upgrade to 2.9 from 2.4, but I think one of the reasons for my
>> index corruption is deleting the index.write file rather than removing the
>> lock through the Lucene APIs.  This seems like it could be a cause of
>> corruption, correct?
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message