lucene-dev mailing list archives

From Shai Erera <>
Subject Re: SMB2 cache
Date Thu, 13 Aug 2009 22:03:55 GMT
Also Mike - even if the writer has committed, and I then notify the other
nodes that they should refresh, it's still possible for them to hit this
exception, right?
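(Editorial aside, not from the thread: one common way to live with a transient FileNotFoundException on reopen is to retry a few times with a short delay, since a stale client-side cache entry eventually expires. A minimal self-contained sketch; `openWithRetry` and the simulated open are hypothetical names, not Lucene API:)

```java
import java.io.FileNotFoundException;
import java.util.concurrent.Callable;

// Hypothetical retry helper: retries an open that may fail with
// FileNotFoundException while a stale client-side cache catches up.
public class RetryOpen {
    public static <T> T openWithRetry(Callable<T> open, int maxAttempts,
                                      long delayMillis) throws Exception {
        FileNotFoundException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return open.call();
            } catch (FileNotFoundException fnfe) {
                last = fnfe;            // stale cache: wait and try again
                Thread.sleep(delayMillis);
            }
        }
        throw last;                     // cache never caught up
    }

    public static void main(String[] args) throws Exception {
        // Simulated reader open that fails twice before succeeding.
        final int[] calls = {0};
        String result = openWithRetry(() -> {
            if (++calls[0] < 3) throw new FileNotFoundException("segments_8");
            return "opened after " + calls[0] + " attempts";
        }, 5, 10);
        System.out.println(result);     // prints "opened after 3 attempts"
    }
}
```

Note this only helps if the cache does eventually see the new files; if it never revalidates, retrying just delays the same exception.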

On Fri, Aug 14, 2009 at 1:02 AM, Shai Erera <> wrote:

> How can the writer delete all previous segments? If I have a reader open,
> doesn't it prevent those files from being deleted? That's why I count on at
> least one of those files existing. Perhaps I'm wrong, though.
> I think we can come up w/ some notification mechanism, through MQ or
> something.
> Do you think this is worth documenting on the Wiki? The entry about FNFE
> during searches mentions NFS or SMB, but does not mention
> SimpleFSLockFactory (which solves a different problem). Maybe we can add
> that info there?
> Shai
> On Fri, Aug 14, 2009 at 12:50 AM, Michael McCandless <
>> wrote:
>> On Thu, Aug 13, 2009 at 5:33 PM, Shai Erera<> wrote:
>> > So if afterwards we read until segments_17 and exhaust the read-ahead, and
>> > we determine that there's a problem - we throw the exception. If instead
>> > we try to read backwards, I'm sure one of the segments will be read
>> > successfully, because that reader must already see an earlier segment, right?
>> I don't think you're guaranteed to read successfully, on reading
>> backwards.
>> Ie, say writer has committed segments_8, and therefore just removed
>> segments_7.
>> When the reader (on a different machine, w/ stale cache) tries to
>> open, its cache claims segments_7 still exists, so we try to open
>> that but fail.  We advance to segments_8 and try to open that, but
>> fail (presumably because local SMB2 cache doesn't consult the server,
>> unlike many NFS clients, I think).  We then try up through segments_17
>> and nothing works.  But going backwards can't work either because
>> those segments files have all been deleted.  (Assuming
>> KeepOnlyLastCommitDeletionPolicy... things do get more interesting if
>> you're using a different deletion policy...).
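(Editorial aside, not from the thread: Lucene's IndexDeletionPolicy is the hook that decides which commits to keep, so keeping the last N commits - rather than only the last one - leaves older segments_N files for a stale reader to fall back to. A self-contained sketch of the idea using stand-in types; `CommitPoint` and `KeepLastNCommits` are hypothetical names mirroring the shape of Lucene's `IndexCommit`/`IndexDeletionPolicy.onCommit`, not its actual API:)

```java
import java.util.List;

// CommitPoint stands in for Lucene's IndexCommit (hypothetical name).
class CommitPoint {
    final long generation;
    boolean deleted = false;
    CommitPoint(long generation) { this.generation = generation; }
    void delete() { deleted = true; }
}

// Sketch of a "keep the last N commits" deletion policy.
class KeepLastNCommits {
    private final int n;
    KeepLastNCommits(int n) { this.n = n; }

    // Called with commits ordered oldest-first; delete all but the newest n.
    void onCommit(List<CommitPoint> commits) {
        for (int i = 0; i < commits.size() - n; i++) {
            commits.get(i).delete();
        }
    }
}
```

With n = 1 this behaves like KeepOnlyLastCommitDeletionPolicy (segments_7 goes away as soon as segments_8 is committed); with a larger n, a stale reader falling back to segments_7 would still find it on the server.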
>> Sadly, the most common approach to refreshing readers, eg checking
>> every N seconds if it's time to reopen, leads directly to this "cache
>> is holding onto stale data".  My guess is that if an app only attempted to
>> reopen the reader after the writer on another machine had committed,
>> then this exception wouldn't happen.  But that'd require some
>> notification mechanism outside of Lucene.
>> Mike
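(Editorial sketch, not from the thread: the commit-notification idea can be illustrated in plain Java with a BlockingQueue standing in for the external MQ - the writer publishes each committed generation, and the reader reopens only when it hears about a commit, instead of polling every N seconds against a possibly stale cache. All names here are hypothetical:)

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// The BlockingQueue stands in for an external message queue between nodes.
public class CommitNotifier {
    private final BlockingQueue<Long> queue = new LinkedBlockingQueue<>();

    // Writer side: called right after a successful commit.
    public void publishCommit(long generation) {
        queue.add(generation);
    }

    // Reader side: blocks until a commit is announced, then returns the
    // generation that is now safe to reopen against.
    public long awaitCommit() throws InterruptedException {
        return queue.take();
    }

    public static void main(String[] args) throws Exception {
        CommitNotifier notifier = new CommitNotifier();
        notifier.publishCommit(8);              // writer commits segments_8
        long gen = notifier.awaitCommit();      // reader wakes up
        System.out.println("reopen at segments_" + gen);
    }
}
```

The point is the ordering guarantee: the reader only attempts a reopen after a commit it was told about, so it never probes a generation that does not yet exist on the server (though a stale SMB2 cache could still, in principle, serve old metadata even then).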
