lucene-dev mailing list archives

From Brian Goetz <br...@quiotix.com>
Subject Re: Making Lucene Transactional
Date Fri, 28 Jun 2002 14:45:21 GMT
> That's interesting.  So it would be a very small change to add transactional
> (and even 2-phase commit) capabilities to the writer?  What about deletes?
> Since they use the reader, would it still be possible to allow a 2-phase
> commit/abort on that?

I think you're not using "transactional" in the same sense as Doug is.

Very few file systems are transactional, although some offer a small
number of atomic operations, such as rename.  This doesn't make them
transactional, but it allows application writers (that's us) to write
apps that are _less likely_ to be victimized by system failure.  But
Lucene still writes blocks to disk via the file system, without a
transaction log, and since disk drivers do things like defer or
reorder disk writes, we could still lose data if the system crashed
at the wrong time.  Still, we do a lot to reduce this risk beyond what
most file-based applications do.
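
To make the "atomic rename" point concrete, here is a minimal sketch of
the write-to-a-temporary-file-then-rename pattern.  It is not Lucene's
code: the class name AtomicFileUpdate and the method replace are made up
for illustration, and it assumes the java.nio.file API from later Java
releases.  Note that it does nothing about deferred or reordered disk
writes, so it only narrows the window for corruption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Lucene.
public class AtomicFileUpdate {

    /**
     * Replaces the contents of target by writing a temporary file in the
     * same directory and then renaming it over target.
     */
    public static void replace(Path target, byte[] contents) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // Same directory as the target, so the rename stays on one file system.
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            Files.write(tmp, contents);
            // ATOMIC_MOVE throws AtomicMoveNotSupportedException if the file
            // system cannot do it.  Whether an existing target is replaced is
            // implementation specific; POSIX rename semantics replace it.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up after a failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

A reader that opens the target sees either the old contents or the new
ones, never a half-written file, as long as the rename itself is atomic
on the platform in question.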

> I would very much like to have a 2-phase commit in Lucene in order to ensure
> that it is always in sync with my database.  I always thought that I'd end
> up having to write custom code to store the Lucene index in the database,
> but maybe that wouldn't be necessary...?

Two-phase commit is a whole different beast; it involves
coordinating multiple transactional resource managers (which Lucene
isn't) with a separate transaction monitor, using a protocol such as
XA or OTS.  We're nowhere near that.

Storing the index in a database would be a good start, although the
Directory interface is really designed around the assumptions of a
file system.  Still, that would not get us all the way there -- you'd
need to introduce transaction demarcation methods into the Lucene API,
so that these could be passed to the DBDirectory, so we would know
what groups of updates should be considered atomic.
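
As a rough illustration of what "transaction demarcation methods" could
mean, here is an in-memory sketch.  Everything in it is hypothetical:
TxDirectorySketch and its begin/commit/rollback methods do not exist in
Lucene, and a real DBDirectory would forward these calls to the
database's own transaction rather than copying a map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not a Lucene class or a proposed API.
public class TxDirectorySketch {
    private final Map<String, byte[]> committed = new HashMap<>();
    private Map<String, byte[]> pending; // null when no transaction is open

    /** Starts a group of updates that should become visible together. */
    public void begin() {
        if (pending != null) throw new IllegalStateException("transaction already open");
        pending = new HashMap<>(committed);
    }

    public void writeFile(String name, byte[] data) {
        requireOpen();
        pending.put(name, data.clone());
    }

    public void deleteFile(String name) {
        requireOpen();
        pending.remove(name);
    }

    /** Makes every update since begin() visible to readers at once. */
    public void commit() {
        requireOpen();
        committed.clear();
        committed.putAll(pending);
        pending = null;
    }

    /** Discards every update since begin(). */
    public void rollback() {
        requireOpen();
        pending = null;
    }

    /** Readers only ever see committed state; returns null if absent. */
    public byte[] readFile(String name) {
        byte[] data = committed.get(name);
        return data == null ? null : data.clone();
    }

    private void requireOpen() {
        if (pending == null) throw new IllegalStateException("no open transaction");
    }
}
```

The point is only that the writer has to tell the directory where a
group of updates starts and ends; without those calls the directory
cannot know which writes belong together.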

And that still doesn't get us close to 2PC; we'd still have to support
XA for that, and I don't see any good reason to undertake that level
of effort at this point.  

However, I think revisiting Directory with an eye towards making it
something that can be efficiently implemented on either a DB or a file
system would be worthwhile.  

> > -----Original Message-----
> > From: Doug Cutting [mailto:cutting@lucene.com]
> > Sent: Thursday, June 27, 2002 10:36 AM
> > To: Lucene Users List
> > Subject: Re: Stress Testing Lucene
> > 
> > 
> > It's very hard to leave an index in a bad state.  Updating the
> > "segments" file atomically updates the index.  So the only way to
> > corrupt things is to only partly update the segments file.  But
> > that too is hard, since it's first written to a temporary file,
> > which is then renamed "segments".  The only vulnerability I know
> > of is that in Java on Win32 you can't atomically rename a file to
> > something that already exists, so Lucene has to first remove the
> > old version.  So if you were to crash between the time that the
> > old version of "segments" is removed and the new version is moved
> > into place, then the index would be corrupt, because it would have
> > no "segments" file.
> > 
> > Doug
