Message-ID: <4B4CCD59.6020002@apache.org>
Date: Tue, 12 Jan 2010 11:28:25 -0800
From: Patrick Hunt
To: zookeeper-user@hadoop.apache.org
Subject: Re: Recommendations for zookeeper deployment
References: <4B4B64B3.4000102@apache.org> <2BC67CC22BD6B048A12DD9C4D48EFDD790B4B59CA7@LNWEXMBX0105.msad.ms.com>
In-Reply-To: <2BC67CC22BD6B048A12DD9C4D48EFDD790B4B59CA7@LNWEXMBX0105.msad.ms.com>

Mekaraj, Prashant wrote:
> Hi,
>
> http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html
> is a great resource. It's rare to see an open source project think so
> much about practical enterprise deployment and this is much
> appreciated.

Thanks!

> There are a few more recommendations that I think would be useful to
> add to the page.

Feel free to open JIRAs when you encounter problems or have feature
suggestions, comments on the docs, anything. If you submit patches as
well it's even better. ;-)

> 1. dataDir size: Since the dataDir stores snapshots and you recommend
> storing at least 3 snapshots, I am thinking of using 3 times the size
> of the heap allocated to the process as a guideline for how big the
> dataDir drive should be.

It needs to be significantly larger than that. 3x would be a lower
bound, not an upper bound. Typically the directory is cleaned up by a
cron script, so you aren't guaranteed that only 3 snaps reside in the
dir at any one time.

> 2. dataLogDir size: Since a new log file is started every time a
> snapshot is taken, and using 3 snapshots as a recommendation, I am
> thinking of using the same 3 times size of heap as a guideline.

You can end up with more than a single log per snapshot, so again this
is really a lower bound, not an upper bound. We've been reluctant to pin
down a number or formula just because it's hard to calculate and can
depend a lot on the environment. Also, given the size of disks these
days, it hasn't been much of an issue, at least for us, and I haven't
heard much about it from others.

It's a good point, though I don't know how one would approach the
calculation. Its primary components are: 1) the frequency of writes to
the ensemble, 2) the heap size, as you suggest, and 3) the frequency of
"cleanup" of the dataDir. There are additional issues, such as
configuration parameters (changing the defaults), that would also need
to be factored in.

> 3. Persistence of data and log directories:
> https://issues.apache.org/jira/browse/ZOOKEEPER-546 implies that
> there are cases where all zk data is loaded from a different
> configuration store. In such cases, even if I use a disk that is
> cleaned regularly (on reboots or rebuilds), I would be fine.

Yes, as long as you don't "rebuild" a majority of the servers at the
same time. :-)

> Also - if a zk server were to be added to an existing ensemble (for
> example, when the machine reboots), if the data and datalog
> directories are empty, it seems to me that the server would sync with
> the leader and build its log and snapshots again, although there will
> be a performance hit on the entire ensemble while this is taking
> place. Is this correct?

Minimal performance hit, really. The leader streams the latest snap/log
to the new zk server. That means not much CPU overhead and minimal IO
(a sequential read of the file), and hopefully your network isn't maxed
out. This goes on in parallel while the rest of the ensemble continues
to process requests (as long as quorum has been maintained, of course).

Patrick
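
The three components named in the reply to point 2 (write frequency, heap size, cleanup frequency) can be combined into a rough back-of-the-envelope estimate. The sketch below is not a formula from the ZooKeeper project; it is only an illustration under stated assumptions: each snapshot is taken to be as large as the heap (the poster's worst-case guideline), a new snapshot is assumed to appear roughly every `snap_count` transactions (100,000 is the documented default for the `snapCount` setting), and the function and parameter names are hypothetical.

```python
import math


def estimate_peak_datadir_bytes(heap_bytes, writes_per_sec, cleanup_interval_sec,
                                snap_count=100_000, retained_snaps=3):
    """Rough estimate of peak dataDir usage between two cleanup runs.

    Assumptions (not from the ZooKeeper docs or the thread):
      - every snapshot is as large as the heap (a pessimistic size),
      - one snapshot is written about every `snap_count` transactions,
      - the cleanup job keeps `retained_snaps` snapshots and removes the rest.
    """
    if heap_bytes < 0 or writes_per_sec < 0 or cleanup_interval_sec < 0:
        raise ValueError("sizes, rates and intervals must be non-negative")
    if snap_count <= 0 or retained_snaps < 0:
        raise ValueError("snap_count must be positive, retained_snaps non-negative")

    # Transactions written while no cleanup runs.
    txns_between_cleanups = writes_per_sec * cleanup_interval_sec
    # Snapshots that pile up on top of the retained ones before the next cleanup.
    new_snaps = math.ceil(txns_between_cleanups / snap_count)
    return (retained_snaps + new_snaps) * heap_bytes


# Example: 1 GiB heap, 100 writes/s, cleanup once a day.
GIB = 2 ** 30
print(estimate_peak_datadir_bytes(1 * GIB, 100, 24 * 60 * 60) / GIB)  # 90.0
```

With no writes at all the estimate collapses to the 3x-heap figure from the question, which is why that figure reads as a lower bound; a busy ensemble with infrequent cleanup can need far more. The same reasoning would apply to dataLogDir, with the extra caveat from the reply that more than one log file can exist per snapshot.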