Mailing-List: contact user-help@zookeeper.apache.org; run by ezmlm
Reply-To: user@zookeeper.apache.org
MIME-Version: 1.0
References: <522F7A9D.20800@nicira.com> <522F8264.5090606@nicira.com>
From: Ted Dunning
Date: Tue, 10 Sep 2013 14:04:10 -0700
Subject: Re: adding a separate thread to detect network timeouts faster
To: Jeremy Stribling
Cc: "user@zookeeper.apache.org"
Content-Type: multipart/alternative; boundary=047d7bd7592071697204e60ddadf

--047d7bd7592071697204e60ddadf
Content-Type: text/plain; charset=UTF-8

Perhaps you should be suggesting a design that is adaptive rather than
configured and guarantees low overhead at the cost of notification time in
extreme scenarios.

For instance, the server can send no more than 1000 (or whatever number)
HB's per second and never more than one per second to any client.  This
caps the cost nicely.


On Tue, Sep 10, 2013 at 1:59 PM, Ted Dunning wrote:

>
> Since you are talking about client connection failure detection, no, I
> don't think that there is a major barrier other than actually implementing
> a reliable check.
>
> Keep in mind the cost.  There are ZK installs with 100,000 clients.  If
> these are heartbeating every 2 seconds, you have 50,000 packets per second
> hitting the quorum or 10,000 per server if all connections are well
> balanced.
>
> If you only have 10 clients, the network burden is nominal.
>
>
>
> On Tue, Sep 10, 2013 at 1:34 PM, Jeremy Stribling wrote:
>
>> I mostly agree, but let's assume that a ~5x speedup in detecting those
>> types of failures is considered significant for some people.  Are there
>> technical reasons that would prevent this idea from working?
>>
>> On 09/10/2013 01:31 PM, Ted Dunning wrote:
>>
>>> I don't see the strong value here.
>>> A few failures would be detected more quickly, but I am not convinced
>>> that this would actually improve functionality significantly.
>>>
>>>
>>> On Tue, Sep 10, 2013 at 1:01 PM, Jeremy Stribling wrote:
>>>
>>>> Hi all,
>>>>
>>>> Let's assume that you wanted to deploy ZK in a virtualized environment,
>>>> despite all of the known drawbacks.  Assume we could deploy it such that
>>>> the ZK servers were all using independent CPUs and storage (though not
>>>> dedicated disks).  Obviously, the shared disks (shared with other, non-ZK
>>>> VMs on the same hypervisor) will cause ZK to hit the default session
>>>> timeout occasionally, so you would need to raise the existing session
>>>> timeout to something like 30 seconds.
>>>>
>>>> I'm curious if there would be any technical drawbacks to adding an
>>>> additional heartbeat mechanism between the clients and the servers, which
>>>> would have the goal of detecting network-only failures faster than the
>>>> existing heartbeat mechanism.  The idea is that there would be a new
>>>> thread dedicated to processing these heartbeats, which would not get
>>>> blocked on I/O.  Then the clients could configure a second, smaller
>>>> timeout value, and it would be assumed that any such timeout indicated a
>>>> real problem.  The existing mechanism would still be in place to catch
>>>> I/O-related errors.
>>>>
>>>> I understand the philosophy that there should be some heartbeat mechanism
>>>> that takes the disk into account, but I'm having trouble coming up with
>>>> technical reasons not to add a second mechanism.  Obviously, the advantage
>>>> would be that the clients could detect network failures and system crashes
>>>> more quickly in an environment with slow disks, and fail over to other
>>>> servers more quickly.
>>>> The only disadvantages I can come up with are:
>>>>
>>>> 1) More code complexity, and slightly more heartbeat traffic on the wire
>>>> 2) I think the servers have to log session expirations to disk, so if the
>>>> sessions expire at a faster rate than the disk can handle, it might lead
>>>> to a large backlog.
>>>>
>>>> Are there other drawbacks I am missing?  Would a patch that added
>>>> something like this be considered, or is it dead from the start?  Thanks,
>>>>
>>>> Jeremy
>>>>
>>>>
>>>>
>>
>

--047d7bd7592071697204e60ddadf--
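
The cost arithmetic and the capped, adaptive heartbeat idea from this thread
can be sketched roughly as follows. This is only an illustration, not
ZooKeeper code: the function names are hypothetical, the five-server ensemble
is inferred from the 50,000 / 10,000 figures quoted above, and the cap of
1000 heartbeats per second per server is just the example number from the
message.

```python
def aggregate_heartbeat_rate(clients: int, interval_s: float) -> float:
    """Heartbeats per second arriving at the whole ensemble when every
    client sends one heartbeat each interval_s seconds."""
    return clients / interval_s


def capped_interval(clients_on_server: int, server_cap_per_s: float,
                    min_interval_s: float = 1.0) -> float:
    """Hypothetical adaptive rule: seconds between heartbeats for each
    client on one server, so that the server handles at most
    server_cap_per_s heartbeats per second in total and never more than
    one heartbeat per min_interval_s for any single client."""
    if clients_on_server <= 0:
        return min_interval_s
    # With few clients the per-client floor applies; with many clients
    # the interval stretches so the total stays under the server cap.
    return max(min_interval_s, clients_on_server / server_cap_per_s)


if __name__ == "__main__":
    servers = 5  # assumption inferred from the figures in the thread
    total = aggregate_heartbeat_rate(100_000, 2.0)
    print(total, total / servers)          # fixed 2-second heartbeats
    print(capped_interval(10, 1000.0))     # small install: floor applies
    print(capped_interval(20_000, 1000.0)) # large install: interval grows
```

With a fixed 2-second interval the load grows linearly with the client count.
Under the capped rule the load per server stays bounded, and what grows
instead is the interval (and so the detection time) on very large installs,
which is the trade-off described at the top of this message.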