hadoop-zookeeper-user mailing list archives

From Benjamin Reed <br...@yahoo-inc.com>
Subject Re: total # of zknodes
Date Thu, 15 Jul 2010 15:57:14 GMT
i think there is a wiki page on this, but for the short answer:

the number of znodes impacts two things: memory footprint and recovery 
time. each znode has a base overhead to store its path, pointers to 
the data, pointers to the acl, etc. i believe that is around 100 bytes. 
you can't just divide your memory by 100 bytes + 1K (for data) though, 
because the GC needs to be able to run, collect things, and maintain 
some free space. if you use 3/4 of your available memory, that would 
mean with 4G you can store about three million znodes (3G divided by 
roughly 1.1K per znode). when there is a crash and you recover, servers 
may need to read this data back off the disk or over the network. that 
means it will take about a minute to read 3G from the disk and perhaps 
a bit more to read it over the network, so you will need to adjust your 
initLimit accordingly.
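
the arithmetic above can be sketched in a few lines of Python. this is only an illustration of the estimate, not a measurement: the function names are made up for this example, the 50 MB/s read rate is an assumption inferred from "about a minute to read 3G", and the 2000 ms tickTime and the 2x safety factor are assumed values that you should replace with your own. it relies on initLimit being expressed in ticks, so the time allowed for the initial sync is initLimit * tickTime.

```python
import math

GIB = 1024 ** 3


def estimate_znode_capacity(heap_bytes, data_bytes_per_znode,
                            overhead_bytes=100, usable_fraction=0.75):
    """Rough number of znodes that fit in the heap.

    overhead_bytes is the ~100 byte per-znode estimate from the text;
    usable_fraction leaves headroom for the GC (3/4 in the text).
    """
    usable = heap_bytes * usable_fraction
    return int(usable // (overhead_bytes + data_bytes_per_znode))


def estimate_init_limit(snapshot_bytes, read_bytes_per_sec,
                        tick_time_ms=2000, safety_factor=2.0):
    """Rough initLimit (in ticks) needed to load snapshot_bytes.

    read_bytes_per_sec, tick_time_ms and safety_factor are assumptions;
    measure your own disk and network rates before relying on this.
    """
    seconds = snapshot_bytes / read_bytes_per_sec
    return math.ceil(seconds * safety_factor * 1000 / tick_time_ms)


# 4G heap, 1K of data per znode, 3/4 of the heap usable.
znodes = estimate_znode_capacity(4 * GIB, data_bytes_per_znode=1024)

# 3G to read back at an assumed 50 MB/s, with 2000 ms ticks.
init_limit = estimate_init_limit(3 * GIB, read_bytes_per_sec=50_000_000)

print(f"approx. znodes: {znodes:,}")
print(f"suggested initLimit (ticks): {init_limit}")
```

with these inputs the znode count comes out a little under three million, which matches the estimate above. the initLimit figure depends entirely on the assumed read rate and safety factor.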

of course this is all back-of-the-envelope. i would suggest doing some 
quick benchmarks to make sure your results are in line with 
expectations.

ben

On 07/15/2010 02:56 AM, Maarten Koopmans wrote:
> Hi,
>
> I am mapping a filesystem to ZooKeeper, and use it for locking and mapping a filesystem
> namespace to a flat data object space (like S3). So assuming proper nesting and small ZooKeeper
> nodes (< 1KB), how many nodes could a cluster with a few GBs of memory per instance realistically
> hold totally?
>
> Thanks, Maarten

