Date: Tue, 11 May 2010 12:51:29 -0700
Subject: Re: Using HBase on other file systems
From: Jeff Hammerbacher
To: hbase-user@hadoop.apache.org, apurtell@apache.org

Hey,

Thanks for the evaluation, Andrew. Ceph certainly is elegant in design; HDFS, like GFS [1], was purpose-built to get into production quickly, so its current incarnation lacks some of that elegance. On the other hand, there are many techniques for making the metadata servers scalable and highly available. HDFS has the advantage of already storing hundreds of petabytes across thousands of organizations, so we're able to guide those design decisions with empirical data from heavily used clusters.

We'd love to have heavy users of HBase contribute to the discussions of scalability [2] and availability [3] of HDFS. See also the excellent article from Konstantin Shvachko of Yahoo! on HDFS scalability [4].

I've also conducted extensive reviews of alternative file systems, at both Facebook and now at Cloudera, and at this stage I concur with Andrew: HDFS is the only reasonable open source choice for production data processing workloads. I'm also optimistic that the scalability and availability challenges will be addressed by the (very active and diverse) HDFS developer community over the next few years, and that we'll benefit from the work that's already been put into the robustness and manageability of the system.

Regardless, every technology improves more rapidly when there's strong competition, so it will be good to see one of these other file systems emerge as a viable alternative to HDFS for HBase storage some day.
[1] http://cacm.acm.org/magazines/2010/3/76283-gfs-evolution-on-fast-forward/fulltext
[2] https://issues.apache.org/jira/browse/HDFS-1051
[3] https://issues.apache.org/jira/browse/HDFS-1064
[4] http://developer.yahoo.net/blogs/hadoop/2010/05/scalability_of_the_hadoop_dist.html

Later,
Jeff

On Sun, May 9, 2010 at 9:44 AM, Andrew Purtell wrote:

> Our experience with Gluster 2 is that self heal when a brick drops off the
> network is very painful; the performance impact is high and lasts for a long
> time. I'm not sure, but I think Gluster 3 may only rereplicate missing
> sections instead of entire files. On the other hand, I would not trust
> Gluster 3 to be stable (yet).
>
> I've also tried KFS. My experience seems to bear out other observations that
> it is ~30% slower than HDFS. Also, I was unable to keep the chunkservers up
> on my CentOS 5 based 64-bit systems. I gave Sriram shell access so he could
> poke around the coredumps with gdb, but there was no satisfactory resolution.
>
> Another team at Trend is looking at Ceph. I think it is a highly promising
> filesystem, but at the moment it is an experimental filesystem undergoing a
> high rate of development that requires another experimental filesystem
> undergoing a high rate of development (btrfs) for its recovery semantics, and
> the web site warns "NOT SAFE YET" or similar. I doubt it has ever been
> tested on clusters > 100 nodes. In contrast, HDFS has been running in
> production on clusters with 1000s of nodes for a long time.
>
> There currently is not a credible competitor to HDFS, in my opinion. Ceph is
> definitely worth keeping an eye on, however. I wonder if HDFS will evolve to
> offer a similarly scalable metadata service (NameNode) to compete. Certainly
> that would improve its scalability and availability story, both issues today
> presenting barriers to adoption, and barriers for anything layered on top,
> like HBase.
>
> - Andy
>
> > From: Kevin Apte
> > Subject: Using HBase on other file systems
> > To: hbase-user@hadoop.apache.org
> > Date: Sunday, May 9, 2010, 5:08 AM
> >
> > I am wondering if anyone has thought about using HBase on other file
> > systems like "Gluster". I think Gluster may offer much faster performance
> > without exorbitant cost. With Gluster, you would have to fetch the data
> > from the "Storage Bricks" and process it in your own environment. This
> > allows the servers that are used as storage nodes to be very cheap.
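For anyone who wants to experiment along these lines: HBase talks to its storage through Hadoop's FileSystem abstraction and is pointed at a backend by the hbase.rootdir URI, so in principle only the URI scheme changes when HDFS is swapped for something else. The Java sketch below illustrates that idea; the host names, ports, and the /mnt/gluster FUSE mount path are hypothetical placeholders for illustration, not configurations anyone in this thread tested.

// Minimal sketch (not from the thread): HBase reads and writes its data
// through Hadoop's FileSystem API, so the storage backend is selected by
// the URI scheme in hbase.rootdir rather than by HBase code. The Gluster
// path below assumes a hypothetical FUSE mount of a Gluster volume.
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsAbstractionDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // Any of these could in principle serve as hbase.rootdir; only the
    // scheme and authority change, the client code stays the same.
    URI hdfsRoot  = URI.create("hdfs://namenode:9000/hbase");    // HDFS
    URI kfsRoot   = URI.create("kfs://metaserver:20000/hbase");  // KFS (CloudStore)
    URI posixRoot = URI.create("file:///mnt/gluster/hbase");     // POSIX mount, e.g. Gluster via FUSE

    // Same API regardless of which backend sits behind the URI; here we
    // exercise the local/POSIX case, which needs no extra jars.
    FileSystem fs = FileSystem.get(posixRoot, conf);
    Path probe = new Path(posixRoot.toString(), "probe.tmp");
    FSDataOutputStream out = fs.create(probe, true);
    out.writeUTF("filesystem abstraction probe");
    out.close();
    System.out.println("Wrote " + probe + " via " + fs.getClass().getSimpleName());
    fs.delete(probe, false);
  }
}

As the thread concludes, hdfs:// remains the well-trodden path in production; the point of the sketch is only that the storage layer under HBase is pluggable at the FileSystem level, which is what makes evaluations like Andrew's possible at all.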