lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: best way to ensure IndexWriter won't corrupt the index?
Date Wed, 25 Nov 2009 17:18:12 GMT
Why do you want to kill your indexer anyway? Just because it had
been running "too long"? Or was it behaving poorly?

But yeah, you need to change your process. Removing the write lock while
the first indexer may still be writing almost guarantees that you'll
corrupt your index. Perhaps, if you really need to stop and restart, you
could have your indexer voluntarily stop after 8 hours. That would allow
you to close your IndexWriter, thus ensuring that all the documents in
memory are written to the index. Killing the process from outside does NOT
guarantee this: the kill by itself won't corrupt your index, but the index
also won't have all your documents.
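
Here is a rough sketch of the "voluntarily stop after 8 hours" idea, in case
it helps. It is not real Lucene code: DocWriter and BufferingWriter are
hypothetical stand-ins for IndexWriter so that the example runs on its own,
and the names TimeBoxedIndexer and indexWithinBudget are made up for
illustration. The only point is the shape of the loop: check the time budget
between documents, and always close the writer on the way out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class TimeBoxedIndexer {

    /**
     * Hypothetical stand-in for Lucene's IndexWriter (NOT a Lucene type).
     * The real addDocument/close calls throw IOException; that is left out
     * here to keep the sketch short.
     */
    interface DocWriter {
        void addDocument(String doc);
        void close();
    }

    /** In-memory stand-in: buffered documents only become durable on close(). */
    static class BufferingWriter implements DocWriter {
        final List<String> buffered = new ArrayList<>();
        final List<String> committed = new ArrayList<>();
        boolean closed = false;

        public void addDocument(String doc) {
            buffered.add(doc);
        }

        public void close() {
            committed.addAll(buffered);
            buffered.clear();
            closed = true;
        }
    }

    /**
     * Adds documents until the source is exhausted or the time budget is
     * spent, then closes the writer. Returns the number of documents added.
     */
    static int indexWithinBudget(Iterator<String> docs, DocWriter writer, long budgetNanos) {
        long start = System.nanoTime();
        int added = 0;
        try {
            // Check the clock between documents so the loop ends on its own.
            while (docs.hasNext() && System.nanoTime() - start < budgetNanos) {
                writer.addDocument(docs.next());
                added++;
            }
        } finally {
            // Runs on normal exit and on exceptions, so buffered documents
            // are flushed before the process goes away.
            writer.close();
        }
        return added;
    }

    public static void main(String[] args) {
        BufferingWriter writer = new BufferingWriter();
        Iterator<String> docs = Arrays.asList("doc1", "doc2", "doc3").iterator();
        int added = indexWithinBudget(docs, writer, TimeUnit.HOURS.toNanos(8));
        System.out.println("added " + added + ", committed " + writer.committed.size());
    }
}
```

With a real IndexWriter the close() in the finally block is what writes the
in-memory documents and releases the write lock, so the next run can start
without anyone removing the lock by hand.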

Best
Erick

On Wed, Nov 25, 2009 at 11:10 AM, Max Lynch <ihasmax@gmail.com> wrote:

> On Wed, Nov 25, 2009 at 9:49 AM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
> > Before 2.4 it was possible that a crash of the OS, or sudden power
> > loss to the machine, could corrupt the index.  But that's been fixed
> > with 2.4.
> >
> > The only known sources of corruption are hardware faults (bad RAM, bad
> > disk, etc.), and, accidentally allowing 2 writers to write to the same
> > index at once (this will very quickly cause corruption).  Lucene's
> > write lock normally prevents this from happening.
> >
>
> So in my case I have an indexer running in the background, and if it had
> been running for more than 8 hours, I would remove the write lock and start
> another indexer, which would cause corruption if the first one was still
> writing.  It's a bad process I'm using and I know it's not how Lucene
> usually does things so I need to improve my system.
>
