hadoop-zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: total # of zknodes
Date Thu, 15 Jul 2010 16:10:39 GMT
I've done some tests with ~600 clients creating 5 million znodes (size 
100 bytes IIRC) and 25 million watches. I was using 8 GB of memory for 
this; in this scenario, however, it's critical that you tune the GC, 
in particular you need to turn on the CMS and incremental GC options. 
Otherwise, when the GC collects it will collect for long periods of time 
and all of your clients will then time out. Keep an eye on the max 
latency of your servers; that's usually the most obvious indication of 
GC pauses (it will spike up).
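As a sketch, the CMS and incremental GC options mentioned above can be 
enabled by setting JVMFLAGS in ZooKeeper's conf/java.env, which 
zkServer.sh sources before starting the JVM. The 8 GB heap below matches 
the test setup described here and is an assumption to adjust for your 
own data set:

```shell
# conf/java.env -- sourced by zkServer.sh at startup.
# Heap sized for the ~5M-znode test above; tune for your own data set.
export JVMFLAGS="-Xms8g -Xmx8g \
  -XX:+UseConcMarkSweepGC \
  -XX:+CMSIncrementalMode \
  -verbose:gc -XX:+PrintGCDetails"
```

Logging GC activity (the last two flags) makes it easy to correlate 
collection pauses with latency spikes seen by clients.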

You can use the latency tester from here to do the quick benchmarks Ben 
suggested:
http://github.com/phunt/zk-smoketest
also see: http://bit.ly/4ekN8G
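Between benchmark runs you can watch the server-side latency numbers 
with ZooKeeper's `stat` four-letter command. A minimal sketch that sends 
the command and pulls out the max latency (the sample response at the 
bottom is illustrative, not captured from a real server):

```python
import re
import socket

def zk_stat(host="localhost", port=2181):
    """Send the 'stat' four-letter command to a ZooKeeper server
    and return the raw response text."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(b"stat")
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode()

def max_latency(stat_output):
    """Extract the max latency (ms) from a 'stat' response.
    A GC pause typically shows up as a spike in this number."""
    m = re.search(r"Latency min/avg/max: (\d+)/(\d+)/(\d+)", stat_output)
    return int(m.group(3)) if m else None

# Illustrative response; live usage would be max_latency(zk_stat()):
sample = "Zookeeper version: 3.3.1\nLatency min/avg/max: 0/2/317\n"
print(max_latency(sample))  # -> 317
```

Polling this periodically during a load test gives you the "max latency 
spike" signal Patrick describes without attaching a profiler.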

Patrick

On 07/15/2010 08:57 AM, Benjamin Reed wrote:
> i think there is a wiki page on this, but for the short answer:
>
> the number of znodes impacts two things: memory footprint and recovery
> time. there is a base overhead per znode to store its path, pointers to
> the data, pointers to the acl, etc. i believe that is around 100 bytes.
> you can't just divide your memory by 100+1K (for data) though, because
> the GC needs to be able to run, collect things, and maintain free
> space. if you use 3/4 of your available memory, that would mean with 4G
> you can store about three million znodes. when there is a crash and you
> recover, servers may need to read this data back off the disk or over
> the network. that means it will take about a minute to read 3G from the
> disk and perhaps a bit more to read it over the network, so you will
> need to adjust your initLimit accordingly.
>
> of course this is all back-of-the-envelope. i would suggest doing some
> quick benchmarks to test and make sure your results are in line with
> expectation.
>
> ben
>
> On 07/15/2010 02:56 AM, Maarten Koopmans wrote:
>> Hi,
>>
>> I am mapping a filesystem to ZooKeeper, using it for locking and for
>> mapping a filesystem namespace onto a flat data object space (like S3).
>> So, assuming proper nesting and small ZooKeeper nodes (< 1KB), how many
>> nodes could a cluster with a few GB of memory per instance
>> realistically hold in total?
>>
>> Thanks, Maarten
>
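Ben's back-of-the-envelope arithmetic above can be checked directly. A 
minimal sketch, where the 100-byte per-znode overhead and 1 KB data size 
are his estimates rather than measured values:

```python
# Back-of-the-envelope znode capacity, following the reasoning above.
heap_bytes = 4 * 2**30          # 4 GB heap
usable = heap_bytes * 3 // 4    # leave 1/4 headroom for the GC
per_znode = 100 + 1024          # ~100 B overhead + 1 KB data (estimates)
znodes = usable // per_znode
print(znodes)  # -> 2865858, i.e. roughly three million znodes
```

The recovery side follows from the same numbers: ~3 GB of snapshot to 
re-read means initLimit in zoo.cfg (measured in ticks of tickTime) must 
allow on the order of a minute or more for followers to sync.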
