hadoop-general mailing list archives

From "Segel, Mike" <mse...@navteq.com>
Subject RE: Query regarding Hadoop and cloud infrastructure
Date Thu, 15 Apr 2010 14:30:21 GMT
Outside of EC2 or a commercial site that sells time on a 'cloud', I would argue against
trying to do HOD or build a dynamic cloud for a corporate environment.

Corporate clouds tend to be static in terms of usage: they are built for a specific task,
and any changes are not dynamic enough to justify HOD.

I sat through a presentation from Sun. A nice guy, but in the end, I and others thought it
was a way to make Sun's hardware (sorry, err, I mean Oracle's) relevant in the Hadoop world.
It's counter to the concept of developing on 'white box' commodity hardware.

I'm not sold on virtualization, but it's just my opinion and not necessarily shared by anyone,
which means I need to make the following statement:

The opinions expressed in this post are mine and mine alone. They do not reflect the opinions
or position of my client, or my employer. Any resemblance to a rational coherent thought is
pure coincidence. 


-----Original Message-----
From: Steve Loughran [mailto:stevel@apache.org] 
Sent: Thursday, April 15, 2010 7:15 AM
To: general@hadoop.apache.org
Subject: Re: Query regarding Hadoop and cloud infrastructure

Goel, Mohit IN BLR SISL wrote:
> Hello,
> I have a general query regarding the use of Hadoop with my cloud infrastructure. I am
> trying to achieve scaling up and scaling down in the cloud using Hadoop.
> I have set up a cloud infrastructure which creates images consisting of an OS and
> applications. To access user applications, an instance of an image has to be launched.
> Now I want to make this running instance scalable based on conditions such as:
> a) If the number of users accessing the application hosted in the cloud (i.e. in the
> instance) is high, then one more instance of the image should be launched; if the
> number of users is low, then instances should be terminated.
> b) If CPU usage is high, then one more instance of the image should be launched; if
> CPU usage is low, then the instance should be terminated.
> Can I achieve these goals using Hadoop?

1. Hadoop on Demand, HOD, does some of this
2. Hadoop on EC2 does some of this
3. I've been doing some of this, too; I have some slides up where I 
discuss issues

One quirk of Hadoop is that it likes locality, and it likes machines 
with terabytes of physical storage, which doesn't fit quite as well with 
the VM-on-demand story. If you look at my slides, you can see that 
everything expects stable hostnames and reacts to failure by blacklisting, 
not by killing the VM and creating a new one with the same HDFS volumes 
mounted. There is room for improvement!
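
As a rough illustration of the threshold rules (a) and (b) in the quoted question, the decision logic could be sketched as below. This is only a sketch and is not part of Hadoop, HOD, or any EC2 API: the function name, the Thresholds fields, and every numeric limit are hypothetical values chosen for the example. How the user count and CPU figures are collected, and how instances are actually launched or terminated, is left to whatever cloud tooling is in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # All limits below are hypothetical and would need tuning per application.
    max_users_per_instance: int = 100
    min_users_per_instance: int = 20
    cpu_high: float = 0.80  # average CPU utilisation, as a fraction of 1.0
    cpu_low: float = 0.20


def scaling_decision(instances: int, active_users: int, avg_cpu: float,
                     t: Thresholds = Thresholds()) -> int:
    """Return +1 to launch one instance, -1 to terminate one, 0 to hold."""
    if instances < 1:
        raise ValueError("at least one running instance is required")

    users_per_instance = active_users / instances

    # Rules (a) and (b), scale-up half: either signal alone is enough.
    if users_per_instance > t.max_users_per_instance or avg_cpu > t.cpu_high:
        return 1

    # Scale-down half: require both signals to be low, and never drop
    # below one instance.
    if (instances > 1
            and users_per_instance < t.min_users_per_instance
            and avg_cpu < t.cpu_low):
        return -1

    return 0


if __name__ == "__main__":
    print(scaling_decision(instances=2, active_users=500, avg_cpu=0.50))
    print(scaling_decision(instances=2, active_users=10, avg_cpu=0.10))
```

The question asks for scale-down when either users or CPU is low; the sketch deliberately requires both, so that a briefly idle CPU with many connected users does not terminate an instance. Note that this only decides when to add or remove a VM. It does nothing about the locality and stable-hostname issues described above.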

