lucene-dev mailing list archives

From "Hoss Man (JIRA)" <>
Subject [jira] Commented: (LUCENE-1044) Behavior on hard power shutdown
Date Sat, 03 Nov 2007 05:24:50 GMT


Hoss Man commented on LUCENE-1044:

first off: there have been *numerous* changes to the way Lucene writes to files (particularly
relating to segment files, write locks, and fault tolerance) between 2.0 and 2.2 (not to
mention differences between 1.4.3 and 2.0 that i may not be aware of) -- so you may see many
differences in behavior if you upgrade.

second: to quote myself from a recent thread regarding lucene and "kill -9" ...

: That said, it should never in fact cause index corruption, as far as I
: know.  Lucene is "semi-transactional": at any & all moments you should
: be able to destroy the JVM and the index will be unharmed. I would
: really like to get to the bottom of why this is not the case here.

At any point you can shut down the JVM and the index will be unharmed, but
"destroying" it with "kill -9" goes a little further than that.

Lucene can't make that claim because the JVM can't even guarantee that
bytes are written to physical disk when we close() an OutputStream -- all
it guarantees is that the bytes have been handed to the OS.  When you "kill
-9" a process the OS is free to make *EVERYTHING* about that process
vanish without cleaning up after it ... i'm pretty sure even pending IO
operations are fair game for disappearing.

...what's true for "kill -9" is true for yanking the power cord ... if the JVM isn't shut
down cleanly, there is nothing Lucene or the JVM can do to guarantee that your index is in
a consistent state.
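
To make the close() point concrete, here is a minimal sketch in modern Java (it uses
try-with-resources, so it is newer than the Java 1.5 in the report). The class and method
names (DurableWrite, writeDurably) are hypothetical and are not Lucene code. It only shows
the difference between handing bytes to the OS and asking the OS to push them to the
device with FileDescriptor.sync(). Even with sync(), a disk or controller write cache may
still hold the data, so treat this as an illustration, not a guarantee.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of Lucene.
final class DurableWrite {

    // Writes data to path and asks the OS to flush it to the storage device
    // before the stream is closed.
    static void writeDurably(Path path, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(path.toFile())) {
            out.write(data);     // bytes are handed to the OS (may sit in its cache)
            out.flush();         // no-op for FileOutputStream, kept for clarity
            out.getFD().sync();  // request that OS buffers be written to the device
        }                        // close() alone would not have forced that
    }
}
```

Without the sync() call, a power loss shortly after close() can leave a file whose length
is recorded but whose contents never reached the disk, which would be one possible
explanation for files full of zeros.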

> Behavior on hard power shutdown
> -------------------------------
>                 Key: LUCENE-1044
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>         Environment: Windows Server 2003, Standard Edition, Sun Hotspot Java 1.5
>            Reporter: venkat rangan
> When indexing a large number of documents, upon a hard power failure (e.g. pulling the
> power cord), the index seems to get corrupted. We start a Java application as a Windows Service
> and feed it documents. In some cases (after an index size of 1.7GB, with 30-40 index segment
> .cfs files), the following is observed.
> The 'segments' file contains only zeros. Its size is 265 bytes - all bytes are zeros.
> The 'deleted' file also contains only zeros. Its size is 85 bytes - all bytes are zeros.
> Before corruption, the segments file and deleted file appear to be correct. After this
> corruption, the index is unusable and lost.
> This is a problem observed in Lucene 1.4.3. We are not able to upgrade our customer deployments
> to 1.9 or a later version, but would be happy to back-port a patch, if the patch is small enough
> and if this problem is already solved.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
