lucene-dev mailing list archives

From: "Michael McCandless" <luc...@mikemccandless.com>
Subject: Re: [jira] Commented: (LUCENE-1044) Behavior on hard power shutdown
Date: Mon, 12 Nov 2007 20:38:58 GMT

"robert engels" <rengels@ix.netcom.com> wrote:
> I would be wary of the additional complexity of doing this.
> 
> It would be my vote to make 'sync' an option, and if set, all files  
> are sync'd before close.

This is the way it is now: doSync is an option to FSDirectory,
which defaults to true.

I agree sync() before close() is by far the simplest approach here.
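
To make "sync() before close()" concrete, here is a minimal sketch in
plain Java. It is not the FSDirectory code: the class name, the write
helper and its doSync parameter are made up for illustration, and only
FileDescriptor.sync() is a real JDK call.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SyncBeforeClose {
    /**
     * Writes data to path. When doSync is true (a hypothetical flag that
     * mirrors the option discussed in this thread), the bytes are forced
     * to the storage device before the stream is closed.
     */
    public static void write(Path path, byte[] data, boolean doSync) throws IOException {
        try (FileOutputStream out = new FileOutputStream(path.toFile())) {
            out.write(data);
            out.flush();
            if (doSync) {
                // Blocks until the OS reports the buffers have been written
                // to the underlying device.
                out.getFD().sync();
            }
        } // close() runs here, after the optional sync
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("segment", ".bin");
        write(tmp, "hello".getBytes(StandardCharsets.UTF_8), true);
        System.out.println("wrote " + Files.size(tmp) + " bytes");
        Files.delete(tmp);
    }
}
```

The cost of the sync call is exactly the part that depends on the
hardware, which is why the slowdown varies so much between machines.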

On a good IO system it seems to have minimal performance impact.  On
poor hardware (a laptop hard drive) I'm seeing a rather sizable impact
(~30-40% slowdown when indexing Wikipedia).

But given this I think I would still leave the default at true: I
think keeping the index consistent, even in the somewhat rare event of
a machine/OS crash, trumps indexing performance, as a default?  People
who care about performance are happy to change the defaults.

> With proper hardware setup, this should be a minimal performance  
> penalty.

Right.

> What about writing a marker at the end of each file? I am not sure it  
> is guaranteed, but if the segments file is sync'd and the segment  
> files have the correct marker, then the segment file is OK. Otherwise  
> the "bad" segments/versions can be removed (on start up).

Well ... if we took this approach we would also have to forcefully
keep around the "last known good" commit point, vs what we do now
(delete all but the last commit point).  But, creating such a deletion
policy is not really possible because we can't "query" the IO system
(OS) to find out what's really on stable storage.
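
For reference, the marker idea could look roughly like the sketch
below. Everything in it is hypothetical: the marker value, the class
and the method names are invented here and are not part of any Lucene
file format. Note that it can only detect a missing or damaged tail.
Finding the marker does not prove that the bytes before it reached
stable storage, which is the limitation described above.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class TrailingMarker {
    // Hypothetical 8-byte end-of-file marker, chosen arbitrarily for this sketch.
    static final long MARKER = 0x4C5543454E454F4BL;

    /** Writes the payload followed by the marker, replacing any old content. */
    public static void writeWithMarker(Path path, byte[] payload) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path.toFile(), "rw")) {
            f.setLength(0);
            f.write(payload);
            f.writeLong(MARKER);
        }
    }

    /** Returns true only if the last 8 bytes of the file equal the marker. */
    public static boolean hasMarker(Path path) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path.toFile(), "r")) {
            long len = f.length();
            if (len < Long.BYTES) {
                return false; // too short to contain the marker at all
            }
            f.seek(len - Long.BYTES);
            return f.readLong() == MARKER;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("segment", ".bin");
        writeWithMarker(tmp, new byte[] {1, 2, 3});
        System.out.println("marker present: " + hasMarker(tmp));
        Files.delete(tmp);
    }
}
```

A startup check built on hasMarker could discard files that fail it,
but it would still need an older commit point to fall back to.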

Mike
