incubator-lucy-dev mailing list archives

From Marvin Humphrey <>
Subject fsync
Date Sun, 20 Dec 2009 05:14:20 GMT
On Sat, Dec 19, 2009 at 12:36:37PM -0800, Nathan Kurz wrote:

> I think your approach here makes great sense:  you can't prevent data
> corruption, you just want to reduce the chance of it happening to an
> acceptable level.  

Do you have any objections to the compromise plan we arrived at?

    /** Permanently commit all changes made to the index.
     * Until either Commit() or Flush() completes, read processes cannot see
     * any of the current changes.
     */
    public void
    Commit(Indexer *self);

    /** A lightweight and technically unsafe variant of Commit() which skips
     * calling fsync() to force the OS to write all changes to disk.  It
     * returns faster than Commit(), but leaves the index vulnerable to
     * corruption in the event of a power failure or OS crash.
     */
    public void
    Flush(Indexer *self);

    /** Perform the expensive setup for Commit() (or Flush()) in advance, so
     * that Commit() completes quickly.  (If Prepare_Commit() is not called
     * explicitly by the user, Commit() will call it internally.)
     * @param fsync If true, fsync() all changed files.
     */
    public void
    Prepare_Commit(Indexer *self, bool_t fsync);

    /** Restore the index to the state of the last successful Commit(),
     * discarding any changes which may have been flushed via Flush() since
     * then.
     */
    public void
    Rollback(Indexer *self);
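
For concreteness, here's roughly how I picture a caller using it.  The
Indexer_new() constructor, the schema argument, and the add_docs() helper
below are placeholders for whatever the real constructor and application code
end up looking like; only the four methods above come from the plan.

    /* Hypothetical usage sketch -- placeholder names, real flow. */
    Indexer *indexer = Indexer_new(schema, "/path/to/index");
    add_docs(indexer);                 /* application-specific indexing */
    Prepare_Commit(indexer, true);     /* pay the fsync cost up front */
    /* ... other work can proceed while the commit sits staged ... */
    Commit(indexer);                   /* quick, since Prepare_Commit() ran */

    /* Or, when durability can wait: make the changes visible without fsync. */
    Flush(indexer);
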
> Thinking about how you could add an external log file seems like a better
> failsafe than trying to do full 'commits' within the writer, seeing as there
> is no guarantee those commits will actually hit disk.

I think the addition of Flush() and Rollback() will make it easier for
top-level apps to manage a transaction log.
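
Something along these lines, say.  The log helpers and the Indexer_Add_Doc()
call below are made-up application code, just to show where Flush() and
Rollback() would slot in:

    /* App-level write-ahead log sketch; more_docs(), next_doc(),
     * append_to_app_log(), replay_app_log(), and Indexer_Add_Doc() are all
     * placeholders. */
    while (more_docs()) {
        Doc *doc = next_doc();
        append_to_app_log(doc);        /* durable app-side record first */
        Indexer_Add_Doc(indexer, doc);
        Flush(indexer);                /* visible to readers, not yet fsync'd */
    }
    Commit(indexer);                   /* periodic durable checkpoint */

    /* Crash recovery: drop anything that never got fsync'd, then replay the
     * app log from the last successful Commit() forward. */
    Rollback(indexer);
    replay_app_log(indexer);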

Mike's point about "unbounded" corruption is a good one.  Old segment data
which gets merged away can suddenly get destroyed if fsync'ing the new segment
is incomplete when the power goes out.

We guard against that by always keeping around the last snapshot which went
through the complete fsync dance.  Essentially, we'll need to keep around two
snapshots: the last flushed snapshot, and the last committed snapshot.  

By "snapshot", I mean both the snapshot file itself and all the files it

It's true that we can't guarantee that the hard drive obeys the fsync call,
but I do think that offering an fsync'd Commit() as an option is a good idea
-- so long as we also offer the "atomic but not durable" option for those who
know what they're doing.

> I also think that Mike is making too much distinction between "relying
> on the file system" and "using shared memory".  I think one can safely
> view them as two interfaces to the same underlying mechanism.

I agree with that, and it was kind of confusing since Mike had previously
seemed to suggest that the flush() semantics were a "Lucy-ification" of the
Lucene model.  See the first section of my latest reply to him:

Marvin Humphrey
