From: Shai Erera
To: java-dev@lucene.apache.org
Subject: Re: SMB2 cache
Date: Fri, 14 Aug 2009 01:03:55 +0300
Message-ID: <786fde50908131503x162dd9e1gd73a503dedf8e4ad@mail.gmail.com>
In-Reply-To: <786fde50908131502q25fb5b81j5ab979cea68c762e@mail.gmail.com>

Also Mike - even if the writer has committed, and then I notify the other
nodes that they should refresh, it's still possible for them to hit this
exception, right?

On Fri, Aug 14, 2009 at 1:02 AM, Shai Erera wrote:

> How can the writer delete all previous segments? If I have a reader open,
> doesn't it prevent those files from being deleted? That's why I count on
> one of those files existing. Perhaps I'm wrong though.
>
> I think we can come up w/ some notification mechanism, through MQ or
> something.
>
> Do you think it's worth documenting on the Wiki? The entry about FNFE
> during searches mentions NFS or SMB, but does not mention
> SimpleFSLockFactory (which solves a different problem). Maybe we can add
> that info there?
>
> Shai
>
>
> On Fri, Aug 14, 2009 at 12:50 AM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
>> On Thu, Aug 13, 2009 at 5:33 PM, Shai Erera wrote:
>>
>> > So if afterwards we read until segments_17 and exhaust read-ahead, and
>> > we determine that there's a problem - we throw the exception. If
>> > instead we try to read backwards, I'm sure one of the segments will be
>> > read successfully, because that reader must already see some segment,
>> > right?
>>
>> I don't think you're guaranteed to read successfully when reading
>> backwards.
>>
>> I.e., say the writer has committed segments_8, and therefore just removed
>> segments_7.
>>
>> When the reader (on a different machine, w/ stale cache) tries to
>> open, its cache claims segments_7 still exists, so we try to open
>> that but fail. We advance to segments_8 and try to open that, but
>> fail (presumably because the local SMB2 cache doesn't consult the server,
>> unlike many NFS clients, I think). We then try up through segments_17
>> and nothing works. But going backwards can't work either, because
>> those segments files have all been deleted. (Assuming
>> KeepOnlyLastCommitDeletionPolicy... things do get more interesting if
>> you're using a different deletion policy...)
>>
>> Sadly, the most common approach to refreshing readers, e.g. checking
>> every N seconds if it's time to reopen, leads directly to this "cache
>> is holding onto stale data" problem. My guess is that if an app only
>> attempted to reopen the reader after the writer on another machine had
>> committed, then this exception wouldn't happen. But that'd require some
>> notification mechanism outside of Lucene.
>>
>> Mike
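To make Mike's scenario concrete, here is a small simulation of the probing he describes. It is only a sketch under stated assumptions, not Lucene's actual segments-file lookup: find_commit, cached_listing, server_files and cache_consults_server are hypothetical names, the READ_AHEAD value of 10 is taken from the segments_7 through segments_17 example in the thread, and the stale SMB2 cache is modelled as a client that answers "no such file" for names it has not seen without asking the server.

```python
from typing import Optional, Set

# Assumption: look-ahead of 10 generations, matching the thread's example
# of probing from segments_7 up through segments_17.
READ_AHEAD = 10


def find_commit(cached_listing: Set[int], server_files: Set[int],
                cache_consults_server: bool) -> Optional[int]:
    """Return the generation N of the first segments_N that opens, else None.

    cached_listing: generations the client-side cache believes exist.
    server_files:   generations that really exist on the file server.
    """
    def can_open(gen: int) -> bool:
        # A name unknown to the cache is rejected locally unless the
        # client is willing to ask the server (hypothetical cache model).
        if gen not in cached_listing and not cache_consults_server:
            return False
        # Otherwise the open reaches the server, which knows the truth.
        return gen in server_files

    start = max(cached_listing)
    # Forward probing: start at the newest generation the cache claims.
    for gen in range(start, start + READ_AHEAD + 1):
        if can_open(gen):
            return gen
    # Backward probing: the fallback suggested earlier in the thread.
    for gen in range(start - 1, 0, -1):
        if can_open(gen):
            return gen
    return None


# Writer committed segments_8 and deleted segments_7
# (the KeepOnlyLastCommitDeletionPolicy case); the reader's cache is stale.
stale = find_commit({7}, {8}, cache_consults_server=False)

# Same files, but the client asks the server about unknown names.
consulting = find_commit({7}, {8}, cache_consults_server=True)

# The reader's view is current, e.g. it reopened after a commit notice.
refreshed = find_commit({8}, {8}, cache_consults_server=False)

# A deletion policy that keeps the previous commit around.
kept_old = find_commit({7}, {7, 8}, cache_consults_server=False)
```

In this model the stale case fails in both directions, while the other three find a commit. The "refreshed" case assumes that a commit notification actually results in a current cache view on the reader's machine, which is exactly the point the question at the top of this message leaves open.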