hadoop-common-user mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: How to manage a large cluster?
Date Fri, 12 Sep 2008 21:41:55 GMT
James Moore wrote:
> On Thu, Sep 11, 2008 at 5:46 AM, Allen Wittenauer <aw@yahoo-inc.com> wrote:
>> On 9/11/08 2:39 AM, "Alex Loddengaard" <alexlod@google.com> wrote:
>>> I've never dealt with a large cluster, though I'd imagine it is managed the
>>> same way as small clusters:
>>    Maybe. :)

Depends how often you like to be paged, doesn't it :)

> 
>>    Instead, use a real system configuration management package such as
>> bcfg2, smartfrog, puppet, cfengine, etc.  [Steve, you owe me for the plug.
>> :) ]

Yes Allen, I owe you beer at the next apachecon we are both at.
Actually, I think Y! were one of the sponsors at the UK event, so we owe 
you for that too.


> Or on EC2 and its competitors, just build a new image whenever you
> need to update Hadoop itself.


1. It's still good to have as much automation of your image build as you 
can; if you can build new machine images on demand you can have 
fun/make a mess of things. Look at http://instalinux.com to see the web 
GUI for creating Linux images on demand that is used inside HP.

2. When you try to bring up everything from scratch, you have a 
choreography problem. DNS needs to be up early, then your authentication 
system, the management tools, then the other parts of the system. If you 
have a project where Hadoop is integrated with the front end site, for 
example, your app servers have to stay offline until HDFS is live. So 
it does get complex.

3. The Hadoop nodes are good here in that you aren't required to bring 
up the namenode first; the datanodes will wait; same for the task 
trackers and job tracker. But if you, say, need to point everything at a 
new hostname for the namenode, well, that's a config change that needs 
to be pushed out, somehow.
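
To make the choreography problem in point 2 concrete, here is a rough 
sketch in Python (3.9+, for the standard graphlib module) that derives a 
startup order from declared dependencies. The service names and the 
dependency map are purely illustrative assumptions, not anything a real 
deployment tool defines; a configuration management system would keep 
this information in its own format.

```python
from graphlib import TopologicalSorter

# Hypothetical services; each maps to the set of services that must be
# up before it can start. These names are assumptions for illustration.
DEPENDS_ON = {
    "dns": set(),
    "auth": {"dns"},
    "management": {"dns", "auth"},
    "hdfs": {"dns", "auth", "management"},
    "app_servers": {"hdfs"},
}


def startup_order(depends_on):
    """Return one valid order in which the services can be started."""
    # static_order() yields every node after all of its dependencies and
    # raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for name in startup_order(DEPENDS_ON):
        print("start", name)
```

With this map the app servers always come after HDFS, and DNS always 
comes first. The ordering only tells you what to start when; it does not 
check that a service is actually healthy before the next one is launched.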



I'm adding some stuff on different ways to deploy hadoop here:

http://wiki.smartfrog.org/wiki/display/sf/Patterns+of+Hadoop+Deployment

-steve
