From: lars hofhansl
To: "user@hbase.apache.org"
Subject: Re: export/import for backup
Date: Thu, 23 Feb 2012 23:27:42 -0800 (PST)
Message-ID: <1330068462.94620.YahooMailNeo@web121706.mail.ne1.yahoo.com>

Recovery from these exported HFiles should be extremely fast.

We can add an option to Export that writes HFiles using HFileOutputFormat instead.
Note that you will eat the sort CPU at export time, though, which is (hopefully) more
frequent than importing,
not entirely sure that this is a good trade-off.

-- Lars

________________________________
From: Jacques
To: user@hbase.apache.org
Sent: Tuesday, February 21, 2012 8:22 AM
Subject: Re: export/import for backup

I was thinking about this and have a couple thoughts...

While Stack's solution above would work, it means a couple things: 1. if
you haven't saved splits, you're going to have to figure out how to pre-split
for a full restore.  2. you have to wait for the data re-sort at recovery
time instead of backup time so recovery time will be substantially longer.

It seems like we should make a new script like export that automatically
exports the data as bulk importable along with all of the table's schema
and split information.  We then could make an import script that simply
creates the backed up table (to potentially a different target name) and
then bulk imports it, pre-splitting using the splits defined on export.
(We actually did something like this recently to migrate data from one
format to another.)

It wouldn't work for the case where you are trying to do a merged restore
(e.g. pre-existing table) but it seems like recovery would be really quick.
I suppose you could allow it to support importing into an existing table
but then you may have to wait for splits on a bunch of the files (I know
the bulk import script is designed to do this but I'm not sure how it would
handle a large number of splits if your target table has diverged
substantially from when the backup was done).

Jacques


On Mon, Feb 20, 2012 at 9:19 PM, Stack wrote:

> On Mon, Feb 20, 2012 at 1:58 PM, Paul Mackles wrote:
> > Actually, an hbase export to "bulk load" facility sounds like a great
> > idea. We have been using bulk loads to migrate data from an older data
> > store and they have worked awesome for us.
> > It also doesn't seem like it
> > would be that hard to implement. So what am I missing?
> >
>
> Little?
>
> Check out the Import.java in mapreduce package.  See how it's pulling
> from SequenceFiles into a map that outputs to a TableOutputFormat
> inside the map.  Make a new MR job that has the same input but that
> outputs to HFileOutputFormat instead (you'll need the total order
> partitioner and a reducer in the mix which Import doesn't have).
>
> St.Ack
>
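
To illustrate why the thread keeps coming back to saved splits and a total
order partitioner: for a bulk load, each reducer should receive one
contiguous, sorted range of row keys that lines up with one region. Below is
a rough sketch of that routing step using only the Java standard library. It
is not Hadoop's TotalOrderPartitioner and not code from Export or Import;
the class name SplitKeyPartitioner and its methods are hypothetical, and the
split keys stand in for whatever splits were saved at export time.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical illustration: routes row keys to partitions using saved split keys. */
final class SplitKeyPartitioner {
    // Unsigned lexicographic byte order, which is how HBase orders row keys.
    static final Comparator<byte[]> ROW_ORDER = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    };

    // Sorted split keys; partition i + 1 starts at splitKeys[i] (inclusive).
    private final byte[][] splitKeys;

    SplitKeyPartitioner(byte[][] splitKeys) {
        this.splitKeys = splitKeys.clone();
        Arrays.sort(this.splitKeys, ROW_ORDER);
    }

    /** N split keys define N + 1 key ranges. */
    int numPartitions() {
        return splitKeys.length + 1;
    }

    /** Returns the index of the key range that contains rowKey. */
    int partitionFor(byte[] rowKey) {
        int pos = Arrays.binarySearch(splitKeys, rowKey, ROW_ORDER);
        // Exact hit: the key equals a split key, so it starts the next range.
        // Miss: binarySearch returns -(insertionPoint) - 1.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        // Assumed example splits "g" and "n", giving three key ranges.
        SplitKeyPartitioner p =
            new SplitKeyPartitioner(new byte[][] {{'g'}, {'n'}});
        for (String row : new String[] {"apple", "g", "kiwi", "zebra"}) {
            System.out.println(row + " -> partition " + p.partitionFor(row.getBytes()));
        }
    }
}
```

If the splits were not saved at export time, there is nothing to feed into a
step like this, which is the first problem Jacques describes: the split
points would have to be worked out some other way before a full restore.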