From: Li Pi
Date: Wed, 28 Sep 2011 11:49:31 -0700
Subject: Re: Recommended backup/restore solution for hbase
To: user@hbase.apache.org

What kinds of situations are you looking to guard against? Partial
hardware failure, full hardware failure (of the live cluster), or
accidentally deleting all data?

HDFS provides replication that already guards against partial hardware
failure. If that is all you need, an ephemeral store should be fine.

Also, HBase can use S3 directly as a datastore. You can choose the raw
mode, in which HBase treats S3 as a disk. There used to be a block-based
mode as well, but now that S3 has increased the object size limit to
5 TB, it isn't needed anymore. (Somebody correct me if I'm wrong.)

On Wed, Sep 28, 2011 at 9:15 AM, Vinod Gupta Tankala wrote:

> Hi,
> Can someone answer these basic but important questions for me?
> We are using HBase for our datastore and want to safeguard ourselves from
> data corruption/data loss. We are also hosted on AWS EC2. Currently, I only
> have a single node but want to prepare for scale right away, as things are
> going to change starting in the next couple of weeks. Also, I am currently
> using the ephemeral store for HBase data.
>
> 1) What is the recommended AWS data store method for HBase? Should you use
> the ephemeral store and do S3 backups, or use EBS? I have read and heard
> that EBS can be expensive and also unreliable in terms of read/write
> latency. Of course, it provides data replication and protection, so you
> don't have to worry about that.
>
> 2) What is the recommended backup/restore method for HBase?
> I would like to take periodic data snapshots and then have an import
> utility that will incrementally import data in case I lose some regions
> due to corruption or table inconsistencies. Also, if something
> catastrophic happens, I can restore all of the data.
>
> 3) While we are at it, what are the recommended EC2 instance types for
> running master/ZooKeeper/region servers? I get conflicting answers from
> Google searches, ranging from c1.xlarge to m1.xlarge.
>
> I would really appreciate it if someone could help me.
>
> thanks
> vinod