From: Jack Luo
To: zookeeper-user@hadoop.apache.org
Date: Tue, 24 Jul 2012 06:08:48 -0700 (PDT)
Subject: Data change notification is lost during failover

Hi All,

I am using ZooKeeper 3.3.5 for a distributed
project. During testing we found a watch-related issue. Our monitor program places 100 watches on 100 different paths (e.g. /goo1 … /goo100) to monitor data changes, and a separate writer program updates one of the paths at a specified interval. We found that some data change notifications are occasionally lost when the monitor program switches to a new server because its current server has failed.

I checked the “watch management” section of the current release notes (http://zookeeper.apache.org/doc/trunk/releasenotes.html) and found this statement: “In this release the client library tracks watches that a client has registered and reregisters the watches when a connection is made to a new server.” Based on that, it looks like losing data change notifications during server failover, before the watches are successfully re-registered on the new server, is expected behavior.

The workaround I came up with is to query all 100 paths once the monitor program has connected to a new server and check whether any data has changed. However, if we need to monitor 1,000 or 10K paths, this workaround may not scale well.

Can anyone suggest a better solution to this issue? Furthermore, could the ZK service be enhanced to replicate the watches on every ZK server, so that this issue is solved for good?

Thanks for your time and help!

Jack
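
For reference, the workaround described above (re-checking every watched path after reconnecting) could look roughly like the sketch below. It assumes the monitor records the data version of each node as it processes notifications. The names `find_changed_paths` and `read_version` are hypothetical, and the in-memory dictionary only simulates server state. In a real monitor, `read_version` would wrap a client read that returns the node's data version (in the Java client, the `Stat` returned by `exists` or `getData` exposes it through `getVersion()`).

```python
from typing import Callable, Dict, Iterable, List, Optional


def find_changed_paths(
    paths: Iterable[str],
    last_seen: Dict[str, Optional[int]],
    read_version: Callable[[str], Optional[int]],
) -> List[str]:
    """Return the paths whose data version differs from the last recorded one.

    read_version is a hypothetical callable: it returns the node's current
    data version, or None if the node does not exist.
    """
    changed = []
    for path in paths:
        current = read_version(path)
        if current != last_seen.get(path):
            changed.append(path)
        # Remember what we saw so the next check starts from here.
        last_seen[path] = current
    return changed


# Stand-in for server state; a real monitor would query ZooKeeper instead.
server_versions = {f"/goo{i}": 0 for i in range(1, 101)}

# Versions the monitor had recorded before its server failed.
last_seen = dict(server_versions)

# Simulate writes that happened while the monitor was disconnected.
server_versions["/goo7"] += 1
server_versions["/goo42"] += 2

# After reconnecting, re-check every watched path.
missed = find_changed_paths(list(server_versions), last_seen, server_versions.get)
print(missed)
```

This issues one read per watched path on every reconnect, so the cost grows linearly with the number of paths, which is exactly the scaling concern with 1,000 or 10K paths.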