Mailing-List: contact zookeeper-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: zookeeper-user@hadoop.apache.org
Received-SPF: neutral (nike.apache.org: local policy)
Message-ID: <4B4CCD59.6020002@apache.org>
Date: Tue, 12 Jan 2010 11:28:25 -0800
From: Patrick Hunt <phunt@apache.org>
User-Agent: Thunderbird 2.0.0.23 (X11/20090817)
MIME-Version: 1.0
To: zookeeper-user@hadoop.apache.org
Subject: Re: Recommendations for zookeeper deployment
References: <cf67d0ac1001091514m56291faeh2270fb626783f3ee@mail.gmail.com>
 <4B4B64B3.4000102@apache.org>
 <cf67d0ac1001111417s4658d349u30853cf91516382f@mail.gmail.com>
 <2BC67CC22BD6B048A12DD9C4D48EFDD790B4B59CA7@LNWEXMBX0105.msad.ms.com>
In-Reply-To: 
 <2BC67CC22BD6B048A12DD9C4D48EFDD790B4B59CA7@LNWEXMBX0105.msad.ms.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit


Mekaraj, Prashant wrote:
> Hi,
> 
> http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html
> is a great resource. It's rare to see an open source project think so
> much about practical enterprise deployment and this is much
> appreciated.
> 

Thanks!

> There are a few more recommendations that I think would be useful to
> add to the page.
> 

Feel free to open JIRAs when you encounter problems, feature 
suggestions, comments on docs, anything. If you submit patches as well 
it's even better. ;-)

> 1. dataDir size: Since the dataDir stores snapshots and you recommend
> storing at least 3 snapshots, I am thinking of using 3 times the size
> of the heap allocated to the process as a guideline for how big the
> dataDir drive should be.

It needs to be significantly larger than that. 3x would be a lower
bound, not an upper. Typically the directory is cleaned up by a cron
script, so you aren't guaranteed that only 3 snaps reside in the dir at
any one time.

> 2. dataLogDir size: Since a new log file is started every time a
> snapshot is taken, and using 3 snapshots as a recommendation, I am
> thinking of using the same 3 times size of heap as a guideline.

You can end up with more than a single log per snapshot, so again this 
is really a lower bound, not an upper.

We've been reluctant to pin down a number/calc just because it's hard to
calculate and can depend a lot on the environment. Also, given the size
of disks these days it hasn't been much of an issue, at least for us,
and I haven't heard much about it from others. It's a good point. I
don't know how one would approach the calc, but the primary components
of the calculation are: 1) the frequency of writes to the ensemble, 2)
heap size as you suggest, 3) the frequency of "cleanup" of the datadir.
There are additional issues such as configuration parameters (changing
the defaults) that would also need to be factored in.
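
As a rough illustration only (not an official formula), the factors
above could be combined into a planning figure like the sketch below.
The function name and parameters are hypothetical, and it assumes that
no single snapshot or log file grows beyond the heap size, which is the
guideline from the question rather than a guarantee. Write frequency and
non-default configuration are not modeled, so treat the result as a
floor to pad generously.

```python
def estimate_dir_bytes(heap_bytes, retained=3, extra_between_cleanups=0):
    """Rough planning figure for dataDir (or dataLogDir), in bytes.

    Assumption (hypothetical): each snapshot/log file is at most heap_bytes.
    retained: files the cleanup job is configured to keep.
    extra_between_cleanups: additional files that may pile up before the
    next cleanup run (depends on write rate and cleanup frequency).
    """
    if heap_bytes <= 0 or retained < 1 or extra_between_cleanups < 0:
        raise ValueError("heap_bytes and retained must be positive, extra >= 0")
    return heap_bytes * (retained + extra_between_cleanups)


GIB = 2**30
# 4 GiB heap, keep 3 files, allow 5 more to accumulate between cleanups.
print(estimate_dir_bytes(4 * GIB, retained=3, extra_between_cleanups=5) // GIB)  # prints 32
```

With `extra_between_cleanups=0` this reduces to the "3 times the heap"
figure from the question, which is why that figure is only a lower
bound.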

> 3. Persistence of data and log directories:
> https://issues.apache.org/jira/browse/ZOOKEEPER-546 implies that
> there are cases where all zk data is loaded from a different
> configuration store. In such cases, even if I use a disk that is
> cleaned regularly (on reboots or rebuilds), I would be fine.

Yes, as long as you don't "rebuild" a majority of the servers at the
same time. :-)
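
To make the majority condition concrete, here is a minimal sketch of the
arithmetic. The function names are hypothetical, and the second function
takes a conservative reading: it keeps a full majority of servers with
their data intact at all times.

```python
def majority(ensemble_size):
    """Smallest number of servers that forms a majority of the ensemble."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def max_simultaneous_rebuilds(ensemble_size):
    """Servers that can be wiped at once while a majority keeps its data."""
    return ensemble_size - majority(ensemble_size)


for n in (3, 5, 7):
    print(n, majority(n), max_simultaneous_rebuilds(n))
# prints:
# 3 2 1
# 5 3 2
# 7 4 3
```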

> Also - if a zk server were to be added to an existing ensemble (for
> example when the machine reboots), if the data and datalog
> directories are empty, it seems to me that the server would sync with
> the leader and build its log and snapshots again, although there will
> be a performance hit on the entire ensemble while this is taking
> place. Is this correct ?

Minimal performance hit really. The leader streams the latest snap/log
to the new zk server. There's not much cpu overhead, minimal IO
(sequential read of the file), and hopefully your network isn't maxed
out. This goes on in parallel while the rest of the ensemble continues
to process requests (as long as quorum has been maintained, of course).

Patrick
