To: user@zookeeper.apache.org
From: Shawn Heisey
Subject: Prevent a znode from exceeding jute.maxbuffer
Message-ID: <560DCB9E.4090307@elyograg.org>
Date: Thu, 1 Oct 2015 18:11:10 -0600

I was going to open an issue in Jira for this, but I figured I should discuss it here before I do that, to make sure that's a reasonable course of action.

I was thinking about a problem that we encounter with SolrCloud, where our overseer queue (stored in zookeeper) will greatly exceed the default jute.maxbuffer size.
I encountered this personally while researching something for a Solr issue:

https://issues.apache.org/jira/browse/SOLR-7191?focusedCommentId=14347834&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14347834

It seems silly that a znode could get to 14 times the allowed size without notifying the code *inserting* the data.

The structure of our queue is such that entries in the queue are children of the znode. This means that the data stored directly in the znode (which is pretty much nonexistent in this case) is not the problem; the problem is the number of children.

It seems like it would be a good idea to reject the creation of new children if that would cause the znode size to exceed jute.maxbuffer. This moves the required error handling to the code that *updates* ZK, rather than the code that is watching and/or reading ZK, which seems more appropriate to me.

Alternatively, the mechanisms involved could be changed so that the client can handle accessing a znode with millions of children, without complaining about the packet length.

Thoughts?

Thanks,
Shawn
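
For a rough sense of scale, the child-count problem described above can be sketched as simple arithmetic. This is only an estimate under stated assumptions, not ZooKeeper's actual check: it assumes the child list is serialized as a 4-byte count followed by each name as a 4-byte length plus its UTF-8 bytes, that the reply carries a 16-byte header, and that the limit is the documented default of 0xfffff bytes. The function names and the example child name are hypothetical.

```python
# Rough estimate of the reply size when listing the children of a znode.
# Assumptions (not from the original message): a vector is written as a
# 4-byte count followed by its elements, each string as a 4-byte length
# followed by its UTF-8 bytes, and the reply header is taken to be 16 bytes.

JUTE_MAXBUFFER_DEFAULT = 0xFFFFF  # documented default, just under 1 MiB
REPLY_HEADER_BYTES = 16           # assumed: xid (4) + zxid (8) + err (4)
LENGTH_PREFIX_BYTES = 4


def children_reply_size(child_names):
    """Estimate the serialized size in bytes of a child-name list."""
    size = REPLY_HEADER_BYTES + LENGTH_PREFIX_BYTES  # header + vector count
    for name in child_names:
        size += LENGTH_PREFIX_BYTES + len(name.encode("utf-8"))
    return size


def would_exceed(child_names, new_child, limit=JUTE_MAXBUFFER_DEFAULT):
    """Return True if adding new_child pushes the estimate past limit."""
    added = LENGTH_PREFIX_BYTES + len(new_child.encode("utf-8"))
    return children_reply_size(child_names) + added > limit


def max_children(name_length, limit=JUTE_MAXBUFFER_DEFAULT):
    """Largest number of equal-length ASCII child names that fits in limit."""
    usable = limit - REPLY_HEADER_BYTES - LENGTH_PREFIX_BYTES
    return usable // (LENGTH_PREFIX_BYTES + name_length)


if __name__ == "__main__":
    # Hypothetical queue entry name: a short prefix plus a 10-digit sequence.
    example = "qn-0000000001"
    print(max_children(len(example)))
```

Under these assumptions, a guard like `would_exceed` is the kind of check that would sit on the write path, so the code creating a child sees the failure instead of the code reading the list. The real limit and wire format should be confirmed against the ZooKeeper version in use.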