From: Jeremy Stribling
Date: Thu, 24 Mar 2011 16:32:49 -0700
To: zookeeper-user
Subject: watcher semantics for session events in the C client

I'm using ZooKeeper 3.3.3 with the multi-threaded C client, and I ran into an issue where the same watch callback is triggered twice, once for each of two separate session events. Here are the C client logs:

2011-03-23 17:25:10,180:10715(0x7f727b4cf710):ZOO_ERROR@handle_socket_error_msg@1603: Socket [10.0.0.3:2888] zk retcode=-4, errno=112(Host is down): failed while receiving a server response
2011-03-23 17:25:10,180:10715(0x7f727b4cf710):ZOO_DEBUG@handle_error@1141: Calling a watcher for a ZOO_SESSION_EVENT and the state=CONNECTING_STATE
2011-03-23 17:25:10,180:10715(0x7f727b4cf710):ZOO_INFO@check_events@1585: initiated connection to server [10.0.0.4:2888]
2011-03-23 17:25:10,181:10715(0x7f727b4cf710):ZOO_INFO@check_events@1632: session establishment complete on server [10.0.0.4:2888], sessionId=0xff2ee53bb56e0000, negotiated timeout=6000
2011-03-23 17:25:10,181:10715(0x7f727b4cf710):ZOO_DEBUG@send_set_watches@1312: Sending set watches request to 10.0.0.4:2888
2011-03-23 17:25:10,181:10715(0x7f727b4cf710):ZOO_DEBUG@send_auth_info@1248: Sending all auth info request to 10.0.0.4:2888
2011-03-23 17:25:10,181:10715(0x7f727b4cf710):ZOO_DEBUG@check_events@1638: Calling a watcher for a ZOO_SESSION_EVENT and the state=ZOO_CONNECTED_STATE
2011-03-23 17:25:10,181:10715(0x7f727acce710):ZOO_DEBUG@process_completions@1765: Calling a watcher for node [], type = -1 event=ZOO_SESSION_EVENT
2011-03-23 17:25:10,181:10715(0x7f727acce710):ZOO_DEBUG@process_completions@1765: Calling a watcher for node [], type = -1 event=ZOO_SESSION_EVENT

The logs show one socket going down, which produced a CONNECTING state-change event, and then another socket coming up, which produced a CONNECTED state-change event.
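
To make the symptom concrete, here is a minimal sketch of a watcher callback that counts session events separately from node events. It does not link against the real client. The callback mirrors the parameter shape of the C client's watcher_fn but takes a plain void * in place of the handle, and SESSION_EVENT, CONNECTING_STATE, and CONNECTED_STATE are local stand-ins for ZOO_SESSION_EVENT, ZOO_CONNECTING_STATE, and ZOO_CONNECTED_STATE. The value -1 matches the "type = -1" in the log; the state values 1 and 3 are assumptions to check against zookeeper.h. The names watch_stats, my_watcher, and replay_log are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Local stand-ins for zookeeper.h constants (state values are assumed). */
enum { SESSION_EVENT = -1, CONNECTING_STATE = 1, CONNECTED_STATE = 3 };

/* Hypothetical per-watch context, passed in as the watcher context pointer. */
struct watch_stats {
    int session_events;
    int node_events;
};

/* Same parameter shape as watcher_fn, with void * standing in for the handle. */
static void my_watcher(void *zh, int type, int state, const char *path, void *ctx)
{
    struct watch_stats *stats = ctx;
    (void)zh;
    assert(stats != NULL);

    if (type == SESSION_EVENT) {
        /* Session events carry an empty path; only the state is meaningful. */
        stats->session_events++;
        printf("session event, state=%d, path=[%s]\n", state, path);
        return;
    }

    /* Anything else is an event on a specific znode. */
    stats->node_events++;
    printf("node event, type=%d, path=[%s]\n", type, path);
}

/* Replays the two session events from the log against one watcher. */
static struct watch_stats replay_log(void)
{
    struct watch_stats stats = {0, 0};
    my_watcher(NULL, SESSION_EVENT, CONNECTING_STATE, "", &stats);
    my_watcher(NULL, SESSION_EVENT, CONNECTED_STATE, "", &stats);
    return stats;
}
```

Calling replay_log() runs the callback twice without any node event having occurred, which is the behavior in question. A callback written this way can at least keep session-event handling from being mistaken for the one-time node notification.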
For each event, it looks like collectWatchers() just grabs all the watchers from the hashtable and queues them up in a list with the event. Unlike for non-session events, the watchers appear to be left in the hashtable (collect_session_watchers() doesn't call hashtable_remove() the way add_for_event() does), so both of these events carry exactly the same list of watchers.

The documentation seems a bit ambiguous as to whether session events count toward the fire-exactly-once nature of watch events:

> When a client connects to a new server, the watch will be triggered for any session events. Watches will not be received while disconnected from a server. When a client reconnects, any previously registered watches will be reregistered and triggered if needed.
...
> Watches are one time triggers; if you get a watch event and you want to get notified of future changes, you must set another watch.
...
> When you disconnect from a server (for example, when the server fails), you will not get any watches until the connection is reestablished. For this reason session events are sent to all outstanding watch handlers. Use session events to go into a safe mode: you will not be receiving events while disconnected, so your process should act conservatively in that mode.

Should I expect a watch callback to be triggered more than once when session events are involved?

Thanks,

Jeremy
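
The difference described above between collecting watchers with and without removal can be sketched as a toy model. This is not the client's hashtable code. Every name here (toy_table, toy_register, toy_deliver_session, toy_deliver_node, count_call) is hypothetical, and a fixed-size array stands in for the hashtable. It only illustrates why a watcher that stays registered is called once per session event, while a watcher that is removed on collection fires once.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WATCHERS 8

typedef void (*toy_watcher_fn)(int type, void *ctx);

/* Toy stand-in for the watcher hashtable: a flat list of callbacks. */
struct toy_table {
    toy_watcher_fn fn[MAX_WATCHERS];
    void *ctx[MAX_WATCHERS];
    size_t count;
};

static void toy_register(struct toy_table *t, toy_watcher_fn fn, void *ctx)
{
    assert(t->count < MAX_WATCHERS);
    t->fn[t->count] = fn;
    t->ctx[t->count] = ctx;
    t->count++;
}

/* Session-style delivery: every watcher is called and stays registered. */
static void toy_deliver_session(struct toy_table *t)
{
    for (size_t i = 0; i < t->count; i++)
        t->fn[i](-1, t->ctx[i]);
}

/* Node-style delivery: watchers are removed first, then called (one-shot). */
static void toy_deliver_node(struct toy_table *t, int type)
{
    struct toy_table snapshot = *t;  /* collect the current watchers */
    t->count = 0;                    /* ... and drop them from the table */
    for (size_t i = 0; i < snapshot.count; i++)
        snapshot.fn[i](type, snapshot.ctx[i]);
}

/* Example callback: counts how many times it has been invoked. */
static void count_call(int type, void *ctx)
{
    (void)type;
    (*(int *)ctx)++;
}
```

With one watcher registered, two calls to toy_deliver_session() invoke it twice and leave it in the table, matching the two process_completions lines in the log. A later toy_deliver_node() invokes it once more and empties the table, so further node events reach nothing.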