hadoop-common-user mailing list archives

From James Seigel <ja...@tynt.com>
Subject Re: Distributed Clusters
Date Thu, 08 Apr 2010 17:02:09 GMT
Thanks for the insights into this stuff so far.  I think we are doing some things right with
automating everything and such.  An additional question: I have heard talk about ZooKeeper
being able to help with Hadoop configuration.  Is anyone using ZooKeeper in a way that helps
with deploying their Hadoop cluster?


On 2010-04-08, at 4:18 AM, Steve Loughran wrote:

> James Seigel wrote:
>> I am new to this group, and relatively new to hadoop. I am looking at building a
>> large cluster.  I was wondering if anyone has any best practices for a cluster in the
>> hundreds of nodes?  As well, has anyone had experience with a cluster spanning multiple
>> data centers?  Is this a bad practice? moderately bad practice?  insane?
> got some stuff here
> http://wiki.smartfrog.org/wiki/display/sf/Patterns+of+Hadoop+Deployment
> though my clusters are of short life span and smaller. At that kind of scale you need
> to know how to manage datacenters yourself or talk to people who do (I deny all knowledge,
> though I will note that in HP consulting and EDS we do have people who can handle this)
>> Is it better to build the 1000 node cluster in a single data center?  
> yes.
>> Do you back one of these things up to a second data center or a different 1000 node
> depends on your concerns and where the building is.
> - If your facility is in the Bay Area then you want a separate datacentre on a different
>   fault line. If it's in Eastern WA or OR then you worry more about volcanic activity and
>   spec the roof to take 1-2m of volcanic ash. Power comes off the big dams, which again may
>   go down if there's an earthquake, but is otherwise pretty reliable.
> - If your worry is about continuous availability, you need different sites with different
>   (multiple) power suppliers and multiple data feeds, and more to worry about in terms of
>   keeping things in sync. Data transfer will cost time and money, and a big enough cluster
>   (1000 servers can go up to 6-12 PB of storage) takes time to sync. Even at the CERN LHC
>   experiments' data rate of 1 PB/month off the LHC, it would take 6 months to get the data
>   in to your cluster using a good protocol like GridFTP.
> -single site would make sync easier, 10GB ethernet will still take a while but not cost
>> Sorry, I am asking crazy questions... I am just wanting to learn the meta issues and
>> opportunities with making clusters.
> Start small, automate everything, worry about scaling up the management problems. Hadoop
> filestore and JT scale well, but you have to get your ops right. That's everything from
> BIOS upgrades to log file management.
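
For anyone wanting to sanity-check the sync-time figures in the quoted reply, here is a
rough back-of-envelope sketch. It assumes decimal petabytes, that "10GB ethernet" means a
single 10 Gbit/s link, and that the link is fully saturated with no protocol overhead. The
function and constant names are made up for illustration and are not from any Hadoop tool.

```python
# Back-of-envelope transfer times for syncing a large cluster's storage.
# Assumptions (not from the original thread): decimal units (1 PB = 10**15 bytes),
# a single 10 Gbit/s link, full utilisation, no protocol overhead.

PB = 10**15                       # bytes in a decimal petabyte
SECONDS_PER_DAY = 86_400
TEN_GBE_BYTES_PER_SEC = 10e9 / 8  # 10 Gbit/s expressed in bytes per second


def days_to_transfer(size_bytes, rate_bytes_per_sec):
    """Days needed to move size_bytes at a sustained rate."""
    return size_bytes / rate_bytes_per_sec / SECONDS_PER_DAY


def months_at_rate(size_pb, pb_per_month=1.0):
    """Months needed to move size_pb at a rate given in PB per month."""
    return size_pb / pb_per_month


# The 6-12 PB range quoted for a 1000-server cluster.
for size_pb in (6, 12):
    wan_months = months_at_rate(size_pb)
    lan_days = days_to_transfer(size_pb * PB, TEN_GBE_BYTES_PER_SEC)
    print(f"{size_pb} PB: {wan_months:.0f} months at 1 PB/month, "
          f"{lan_days:.0f} days over one saturated 10 Gbit/s link")
```

Under those assumptions 6 PB works out to 6 months at 1 PB/month and roughly 56 days over
one 10 Gbit/s link. Real throughput will be lower, so treat these as best-case numbers.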

James Seigel
Captain Hammer
