hadoop-general mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Query regarding Hadoop and cloud infrastructure
Date Thu, 15 Apr 2010 12:15:18 GMT
Goel, Mohit IN BLR SISL wrote:
> Hello,
> I have a general query regarding usage of Hadoop with my cloud infrastructure. I am trying
> to achieve scaling up and scaling down in cloud using Hadoop.
> I have set up a cloud infrastructure which creates images consists of OS and applications.
> To access user applications, instance of image has to launch. Now I want to make this running
> or launched instance scalable based on some condition like -
> a) If no. of users who are accessing the application which is hosting in cloud
> (i.e. in instance) are more then it should run one more instance of image and if no. of users
> are less then instances should be terminated.
> b) If CPU usage is more then one more instance of image should run or if CPU usage
> is less then it should terminate the instance.
> Can I achieve these goals using Hadoop?

1. Hadoop on Demand, HOD, does some of this
2. Hadoop on EC2 does some of this
3. I've been doing some of this, too; I have some slides up where I 
discuss issues

One quirk of Hadoop is that it likes locality, and it likes machines
with terabytes of physical storage, which doesn't fit in quite as well
with the VM-on-demand story. If you look at my slides, you can see that
everything expects stable hostnames, and that Hadoop reacts to failure by
blacklisting the node, not by killing the VM and creating a new one with
the same HDFS volumes mounted. There is room for improvement!
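
For illustration only, here is a minimal sketch of the kind of threshold rule the question describes in (a) and (b). It is not part of Hadoop, HOD, or any EC2 tooling; the names `ScalingPolicy` and `scaling_decision` and every threshold value are assumptions made up for this example. It only decides whether to add or remove one instance; actually launching or terminating VMs is left to whatever the cloud infrastructure provides.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    # All values below are illustrative assumptions, not taken from the thread.
    cpu_high: float = 80.0              # average CPU % above which we scale up
    cpu_low: float = 20.0               # average CPU % below which we may scale down
    users_per_instance_high: int = 100  # users per instance above which we scale up
    users_per_instance_low: int = 20    # users per instance below which we may scale down
    min_instances: int = 1
    max_instances: int = 10


def scaling_decision(policy: ScalingPolicy, instances: int,
                     avg_cpu_percent: float, active_users: int) -> int:
    """Return +1 to launch one instance, -1 to terminate one, 0 to do nothing."""
    if instances < 1:
        raise ValueError("at least one running instance is required")

    users_per_instance = active_users / instances

    # Scale up if either signal is high; scale down only if both are low,
    # so a single quiet metric does not remove capacity that is still needed.
    overloaded = (avg_cpu_percent > policy.cpu_high
                  or users_per_instance > policy.users_per_instance_high)
    idle = (avg_cpu_percent < policy.cpu_low
            and users_per_instance < policy.users_per_instance_low)

    if overloaded and instances < policy.max_instances:
        return 1
    if idle and instances > policy.min_instances:
        return -1
    return 0


if __name__ == "__main__":
    policy = ScalingPolicy()
    print(scaling_decision(policy, instances=2, avg_cpu_percent=90.0, active_users=50))
    print(scaling_decision(policy, instances=2, avg_cpu_percent=10.0, active_users=10))
```

The gap between the high and low thresholds is deliberate: it keeps the rule from launching and terminating instances in quick succession when load hovers near a single cutoff. A rule like this suits stateless application instances; for the reasons given above (locality, local storage, stable hostnames), terminating a VM that holds HDFS data is a different matter.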
