hadoop-zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: Ok to share ZK nodes with Hadoop nodes?
Date Mon, 08 Mar 2010 19:21:12 GMT
See the troubleshooting page; there is some apropos detail there (esp. 
relative to virtual environments).

http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting

ZK servers are sensitive to IO (disk/network) latency. As long as you 
don't have very strict latency requirements it should be fine. If the 
machine were to swap, for example, or the JVM were to go into a long 
GC pause (virtualization in particular kills JVM GC), that would be bad.
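
One rough way to see whether a shared box is hurting disk latency is to time fsync calls on the volume that would hold the ZK transaction log, since ZK syncs its log to disk before acknowledging a write. The sketch below is only an illustration under that assumption: the function name, sample count and payload size are made up for the example and are not part of ZooKeeper.

```python
import os
import tempfile
import time
from typing import List


def fsync_latencies_ms(directory: str, samples: int = 20,
                       payload: bytes = b"x" * 4096) -> List[float]:
    """Time os.fsync() on a scratch file in `directory` (hypothetical helper).

    Returns one latency per sample, in milliseconds.
    """
    latencies = []
    # The temporary file is deleted automatically when the block exits.
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        for _ in range(samples):
            f.write(payload)
            f.flush()  # push Python's buffer to the OS
            start = time.perf_counter()
            os.fsync(f.fileno())  # force the OS to write to the device
            latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


if __name__ == "__main__":
    # Assumption: point this at the directory that would hold the ZK log.
    lat = fsync_latencies_ms(tempfile.gettempdir())
    print("max fsync latency: %.2f ms" % max(lat))
```

Running it once while the box is idle and once while a heavy MR job is running gives a crude before/after comparison; it is not a substitute for monitoring the servers themselves.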

Best practice for "on-line production serving" is 5 dedicated hosts with 
"shared nothing", physically distributed throughout the data center (5 
hosts in one rack might not be the best idea for super reliability). 
There's a lot of leeway though; many ppl run with 3 hosts and a SPOF on 
the switch, for example.
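
For context on 5 vs 3: a ZK ensemble stays available as long as a majority of its servers are up, so the arithmetic is simple. A minimal sketch (the function names are illustrative only, not any ZooKeeper API):

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 5):
        print(n, "servers: quorum", quorum_size(n),
              "- tolerates", tolerated_failures(n), "down")
```

So 3 servers survive one failure and 5 survive two, which leaves room for one host to be down for maintenance while still tolerating an unplanned failure.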

Patrick

David Rosenstrauch wrote:
> I'm contemplating an upcoming zookeeper rollout and was wondering what 
> the zookeeper brain trust here thought about a network deployment question:
> 
> Is it generally considered bad practice to just deploy zookeeper on our 
> existing hdfs/MR nodes?  Or is it better to run zookeeper instances on 
> their own dedicated nodes?
> 
> On the one hand, we're not going to be making heavy-duty use of 
> zookeeper, so it might be sufficient for zookeeper nodes to share box 
> resources with HDFS & MR.  On the other hand, though, I don't want 
> zookeeper to become unavailable if the nodes are running a resource 
> intensive job that's hogging CPU or network.
> 
> 
> What's generally considered best practice for Zookeeper?
> 
> Thanks,
> 
> DR
