Date: Mon, 10 Sep 2012 01:02:03 -0700
From: Shanker Balan
To: cloudstack-users@incubator.apache.org
Subject: Re: How to integrate Hadoop to CloudStack
Message-ID: <20120910080203.GB23689@shankerbalan.net>
References: <20120907080929.GA25403@shankerbalan.net> <20120910065308.GA23689@shankerbalan.net>

Hello,

Nguyen Anh Tu wrote,
> Hi Shanker,
>
> You right. I just want to use Hadoop HDFS for CS, no need any sub project.
> Namenode is exactly a single point of failure. I found a good solution
> instead. That's GlusterFS, which have no master node. All node is peer.
> However, CloudStack and Hadoop are both written by Java. They are belong to
> ASF too. Personally, CloudStack and Hadoop are concerned.

I had to do a bit of planning for the private cloud at my previous job and
I pretty much went down the same line of thought for the storage as you
did: NFS -> HDFS -> GlusterFS. I also considered DRBD.

Unfortunately, CS does not support GlusterFS natively, so you can't achieve
HA. In order to use Gluster, you need to export the volumes over NFS, and
you will run into all the limitations of NFS once more.

I liked Isilon's OneFS solution as a replacement for Gluster though.

HTH.

> 2012/9/10 Hieu Le
>
> > ---------- Forwarded message ----------
> > From: Shanker Balan
> > Date: Mon, Sep 10, 2012 at 1:53 PM
> > Subject: Re: How to integrate Hadoop to CloudStack
> > To: cloudstack-users@incubator.apache.org
> >
> >
> > Hello,
> >
> > Nguyen Anh Tu wrote,
> >
> > > Hello Shanker,
> > >
> > > I mean that with CS, I want to replace NFS to HDFS.
> > > You know NFS is not a
> > > suitable solution for storage, because it has not fault-tolerant feature.
> > > So I want to use HDFS for Secondary Storage in CS. I see this
> > > http://www.slideshare.net/kkitase/cloudstack-architecture-future. I think
> > > in near future, Hadoop will be used as a storage solution in CS.
> >
> > You mean "HDFS" as a storage solution for CS? HDFS is just one of the
> > components in the Hadoop project. Hadoop also includes non storage sub
> > projects like MR, Pig, ZK etc.
> >
> > Regarding HDFS, The NameNode machine is a single point of failure for an
> > HDFS cluster at this time. See
> >
> > http://hadoop.apache.org/common/docs/r0.20.2/hdfs_design.html#Metadata+Disk+Failure
> >
> > I have seen the NameNode fail many times at my previous $work place and its
> > not fun. There is a HA NameNode solution in the works but I dont think its
> > reached stable status.
> >
> > http://www.cloudera.com/blog/2012/03/high-availability-for-the-hadoop-distributed-file-system-hdfs/
> >
> > In the end its about the scalability, availability, manageability and cost
> > considerations you wish to achieve that decides every aspect of your cloud
> > solution.
> >
> > Regards.
> >
> >
> > > 2012/9/7 Shanker Balan
> > >
> > > > (Moving to cloudstack-users@ with Bcc to
> > > > cloudstack-dev@incubator.apache.org)
> > > >
> > > > Hello Nguyen,
> > > >
> > > > Nguyen Anh Tu wrote,
> > > > > Hi guy,
> > > > >
> > > > > Anyone can help me to integrate Hadoop to CloudStack. I read the article
> > > > > "CloudStack and Hadoop: a match made in the cloud" but can not find a way
> > > > > to do this.
> > > >
> > > > Could you explain a bit more on what you mean by "Integrating Hadoop To
> > > > Cloudstack"? I am not using CS yet, but I have a bunch of use cases I have
> > > > been thinking about lately.
> > > >
> > > > You can use Cloudstack to provision Hadoop instances very easily.
> > > > Cloudstack's bare metal provisioning capabilities allows you to build
> > > > high performance clusters.
> > > >
> > > > http://www.cloudstack.org/blog/63-cloudstack-the-best-kept-secret-in-cloud-computing.html.html
> > > >
> > > > Cloudstack also provides an S3 compatible interface over supported object
> > > > stores like Swift and Caringo. So instead of using HDFS, you can choose to
> > > > store your data on CS backed by object store+s3 bridge.
> > > >
> > > > http://www.slideshare.net/sebastiengoasguen/cloudstack-s3
> > > > http://wiki.apache.org/hadoop/AmazonS3
> > > >
> > > > On the other hand, if you are expecting a hosted Hadoop solution (like AWS
> > > > EMR), I dont think that's quite ready yet (or if its even on the roadmap
> > > > anytime soon).
> > > >
> > > > --
> > > > http://shankerbalan.net/
> > > >
> > > > PS: cloudstack-users@ might be a more appropriate list to discuss this
> > > > further.
> > > >
> > >
> > > --
> > >
> > > N.g.U.y.e.N.A.n.H.t.U
> >
> > --
> > http://shankerbalan.net/
> >
> >
> > --
> > ..:: Hieu LE ::..
> >
> > Class: Information System - Course 52
> > School of Information and Communication Technology
> > Hanoi University of Technology
> > No 1, Dai Co Viet street - Hai Ba Trung district - Hanoi
> >
> > High Performance Computing Center
> > Cloud Computing Group
> > Gmail: hieulq89@gmail.com
> >
>
> --
>
> N.g.U.y.e.N.A.n.H.t.U

-- 
http://shankerbalan.net/