incubator-cloudstack-users mailing list archives

From Shanker Balan <m...@shankerbalan.net>
Subject Re: How to integrate Hadoop to CloudStack
Date Mon, 10 Sep 2012 06:53:08 GMT
Hello,

Nguyen Anh Tu wrote,
> Hello Shanker,
> 
> I mean that with CS, I want to replace NFS with HDFS. NFS is not a
> suitable storage solution because it has no fault-tolerance features,
> so I want to use HDFS for Secondary Storage in CS. I saw this:
> http://www.slideshare.net/kkitase/cloudstack-architecture-future
> I think that in the near future, Hadoop will be used as a storage
> solution in CS.

You mean "HDFS" as a storage solution for CS? HDFS is just one of the
components in the Hadoop project. Hadoop also includes non-storage
subprojects such as MapReduce, Pig, ZooKeeper, etc.

Regarding HDFS, the NameNode machine is currently a single point of failure
for an HDFS cluster. See
http://hadoop.apache.org/common/docs/r0.20.2/hdfs_design.html#Metadata+Disk+Failure

I have seen the NameNode fail many times at my previous $work place and it's
not fun. There is an HA NameNode solution in the works, but I don't think it
has reached stable status.
http://www.cloudera.com/blog/2012/03/high-availability-for-the-hadoop-distributed-file-system-hdfs/

In the end, it's the scalability, availability, manageability and cost
targets you wish to achieve that decide every aspect of your cloud
solution.

Regards.

> 2012/9/7 Shanker Balan <mail@shankerbalan.net>
> 
> > (Moving to cloudstack-users@ with Bcc to
> > cloudstack-dev@incubator.apache.org)
> >
> > Hello Nguyen,
> >
> > Nguyen Anh Tu wrote,
> > > Hi guys,
> > >
> > > Can anyone help me integrate Hadoop with CloudStack? I read the article
> > > "CloudStack and Hadoop: a match made in the cloud" but cannot find a way
> > > to do this.
> >
> > Could you explain a bit more on what you mean by "Integrating Hadoop To
> > Cloudstack"? I am not using CS yet, but I have a bunch of use cases I have
> > been thinking about lately.
> >
> > You can use Cloudstack to provision Hadoop instances very easily.
> > Cloudstack's bare metal provisioning capabilities allow you to build
> > high performance clusters.
> >
> >
> > http://www.cloudstack.org/blog/63-cloudstack-the-best-kept-secret-in-cloud-computing.html.html
> >
> > Cloudstack also provides an S3 compatible interface over supported object
> > stores like Swift and Caringo. So instead of using HDFS, you can choose to
> > store your data on CS backed by object store+s3 bridge.
> >
> > http://www.slideshare.net/sebastiengoasguen/cloudstack-s3
> > http://wiki.apache.org/hadoop/AmazonS3
> >
> > On the other hand, if you are expecting a hosted Hadoop solution (like AWS
> > EMR), I don't think that's quite ready yet (or whether it's even on the
> > roadmap anytime soon).
> >
> > --
> > http://shankerbalan.net/
> >
> > PS: cloudstack-users@ might be a more appropriate list to discuss this
> > further.
> >
> 
> 
> 
> -- 
> 
> N.g.U.y.e.N.A.n.H.t.U

-- 
http://shankerbalan.net/
