lucene-dev mailing list archives

From "Yonik Seeley" <yo...@apache.org>
Subject Re: [jira] Commented: (LUCENE-665) temporary file access denied on Windows
Date Thu, 14 Sep 2006 14:01:45 GMT
On 9/14/06, Michael McCandless <lucene@mikemccandless.com> wrote:
> Yonik Seeley wrote:
> >> > >> But, I'm still renaming segments_N.new -> segments_N,
> >> > >
> >> > > Hmmm, remind me why you need the .new file?  Why can't you just
> >> > > create segments_N after you are finished writing all of the
> >> > > segments?
> >> >
> >> > Because there could be a reader that tries to read the file before it's
> >> > done being written.  It would hit EOF and throw an IOException.
> >>
> >> Ahh, right... unlikely (the segments file is pretty small), but possible.
> >>
> >> Another alternative (since this changes the index format anyway) is to
> >> put something in the segments file to detect if it's partially
> >> written... something like the size of the file or the number of
> >> segments.  I don't know if the extra complexity would be worth saving
> >> the creation time of an extra file or not...
> >
> > Hey wait... the segments file already has the number of segments.
> > Can't you tell if it's not yet complete?
>
> Good point!  A reader could easily know that it's dealing with an
> unfinished segments file (since the file says how many segments it
> has) and then sleep/retry until the file completes, which should be a
> rare event.  Note that such contention in the current Lucene (ie, on
> the commit lock) results in a 1.0 second delay and then retry.
>
> Though what if the writer has crashed and so the new segments file
> will never be done?  I guess the reader could fall back to the previous
> segments_(N-1) file after some time at the cost of more delay.

If it will happen so rarely, make it simpler and go directly for
segments_(N-1)... (treat it like your previous plan if segments_N.done
hadn't been written yet).
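
A minimal sketch of that fallback in Java, for illustration only. It
assumes a hypothetical layout (an int segment count followed by that
many UTF-encoded names), which is not the actual Lucene segments format;
the class and method names are made up as well. A partially written
file runs out of bytes before the declared count is satisfied, so the
reader sees an EOFException and goes straight to the previous
generation without sleeping:

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Lucene.
final class SegmentsFallback {

    // Reads the assumed layout: an int count, then `count` UTF names.
    // Throws EOFException if the file ends before the count is satisfied.
    static List<String> readSegments(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            int count = in.readInt();
            List<String> names = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                names.add(in.readUTF());
            }
            return names;
        }
    }

    // Tries segments_N first; if it is incomplete, falls back to
    // segments_(N-1) immediately instead of sleeping and retrying.
    static List<String> readLatest(Path dir, long n) throws IOException {
        try {
            return readSegments(dir.resolve("segments_" + n));
        } catch (EOFException incomplete) {
            return readSegments(dir.resolve("segments_" + (n - 1)));
        }
    }
}
```

A crashed writer and a writer that is merely slow look the same to this
reader, which is the point: either way it ends up on the last complete
generation.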

> I think that approach would work but I'm still worried about the
> interaction with filesystem caching.  EG how much latency is added by
> the caching before it realizes this file now has some more data?

Local filesystems don't have that problem.
Remote filesystems would hopefully check for new blocks on demand (as
you try to read it).


-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
