From: Eric Pederson
Date: Sat, 8 Dec 2012 23:56:50 -0500
Subject: Re: leader election, scheduled tasks, losing leadership
To: user@zookeeper.apache.org

If I recall correctly it was Henry Robinson that gave me the advice to have a
"task in progress" check.

-- Eric

On Sat, Dec 8, 2012 at 11:54 PM, Eric Pederson wrote:

> I am using Curator LeaderLatch :)
>
> -- Eric
>
> On Sat, Dec 8, 2012 at 11:52 PM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:
>
>> You might check your leader implementation. Writing a correct leader
>> recipe is actually quite challenging due to edge cases. Have a look at
>> Curator (disclosure: I wrote it) for an example.
>>
>> -JZ
>>
>> On Dec 8, 2012, at 8:49 PM, Eric Pederson wrote:
>>
>> > Actually I had the same thought and didn't consider having to do this
>> > until I talked about my project at a Zookeeper User Group a month or so
>> > ago and I was given this advice.
>> >
>> > I know that I do see leadership being lost/transferred when one of the
>> > ZK servers is restarted (not the whole ensemble). And it seems like I've
>> > seen it happen even when the ensemble stays totally stable (though I am
>> > not 100% sure as it's been a while since I have worked on this particular
>> > application).
>> >
>> > -- Eric
>> >
>> > On Sat, Dec 8, 2012 at 11:25 PM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:
>> >
>> >> Why would it lose leadership? The only reason I can think of is if the
>> >> ZK cluster goes down. In normal use, the ZK cluster won't go down (I
>> >> assume you're running 3 or 5 instances).
>> >>
>> >> -JZ
>> >>
>> >> On Dec 8, 2012, at 8:17 PM, Eric Pederson wrote:
>> >>
>> >>> During the time the task is running a cluster member could lose its
>> >>> leadership.
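
The concern in this thread (a member losing leadership while a scheduled task is still running) could be handled by re-checking leadership between units of work. The following is a minimal sketch of that idea, and only one possible reading of the "task in progress" check mentioned above. `GuardedTask` and `runWhileLeader` are hypothetical names, not part of Curator or ZooKeeper; the `isLeader` supplier is an assumed stand-in for a real leadership query such as Curator's `LeaderLatch.hasLeadership()`.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not part of Curator or ZooKeeper.
final class GuardedTask {
    private GuardedTask() {}

    /**
     * Runs the steps in order, asking isLeader before each one.
     * Stops at the first check that reports false and returns the
     * number of steps that actually ran.
     */
    static int runWhileLeader(BooleanSupplier isLeader, List<Runnable> steps) {
        int completed = 0;
        for (Runnable step : steps) {
            if (!isLeader.getAsBoolean()) {
                break; // leadership lost (or never held): do no further work
            }
            step.run();
            completed++;
        }
        return completed;
    }
}
```

With a Curator `LeaderLatch` named `latch`, the supplier could be `latch::hasLeadership`. A check like this only narrows the window: leadership can still be lost while a single step is running, so each step should be kept short or be safe to repeat.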