Subject: Re: 0.94 going forward
From: Ted Yu
To: user@hbase.apache.org
Date: Tue, 16 Dec 2014 11:53:23 -0800

bq. Then just fail over the applications to the 0.98 cluster

However, the application needs to be compiled with 0.98 jars because the
RPC has changed.

Cheers

On Tue, Dec 16, 2014 at 11:42 AM, Esteban Gutierrez wrote:

> +1 Andrew. It's not a simple task; it is error prone and can cause data
> loss if not performed correctly, and we don't have tooling to fix broken
> snapshots if they are moved manually.
>
> BTW 0.98 should migrate an old snapshot dir to the new post-namespaces
> directory hierarchy after starting HBase from a 0.94 layout. If the goal
> is to minimize downtime, this is probably a better approach: bootstrap
> the destination cluster with 0.94, with snapshots and replication
> enabled, then use the ExportSnapshot tool to copy the snapshots, import
> the snapshots, and use replication on the remote cluster until the delta
> is minimal. Then stop the destination cluster and upgrade to 0.98 (that
> should take care of migrating everything without user intervention).
> Once the destination cluster is migrated to 0.98, use the replication
> bridge tool to catch up with the new edits and reduce the delta between
> both clusters. Then just fail over the applications to the 0.98 cluster,
> and finally repeat the upgrade process on the 0.94 cluster.
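>
> An illustrative sketch only -- the ZooKeeper quorum, table and column
> family names below are hypothetical -- the 0.94-to-0.94 replication part
> of this plan, with hbase.replication set to true in hbase-site.xml on
> both clusters, would look roughly like the following from the HBase
> shell on the source cluster, followed by the ExportSnapshot copy and
> import described above:
>
>   add_peer '1', 'zk1.example.com,zk2.example.com,zk3.example.com:2181:/hbase'
>   disable 'usertable'
>   alter 'usertable', {NAME => 'cf', REPLICATION_SCOPE => 1}
>   enable 'usertable'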
>
> my 2 cents,
> esteban.
>
> --
> Cloudera, Inc.
>
>
> On Tue, Dec 16, 2014 at 9:37 AM, Andrew Purtell wrote:
> >
> > I disagree. Before adding something like that to the ref guide, we
> > should actually agree to support it as a migration strategy. We're not
> > there yet. And although it's a heroic process, we can take steps to
> > make it less kludgy if so.
> >
> > On Tue, Dec 16, 2014 at 9:27 AM, Ted Yu wrote:
> > >
> > > Good summary, Brian.
> > >
> > > This should be added to the ref guide.
> > >
> > > Cheers
> > >
> > > On Tue, Dec 16, 2014 at 4:17 AM, Brian Jeltema
> > > <brian.jeltema@digitalenvoy.net> wrote:
> > > >
> > > > I have been able to export snapshots from 0.94 to 0.98. I've pasted
> > > > the instructions that I developed and published on our internal
> > > > wiki. I also had to significantly increase retry count parameters
> > > > due to a high number of timeout failures during the export.
> > > >
> > > > Cross-cluster transfers
> > > >
> > > > To export a snapshot to a different cluster:
> > > >
> > > >   hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot snappy -copy-to proto://remhost/apps/hbase/data -mappers nummaps
> > > >
> > > > where snappy is the local snapshot to export, remhost is the host
> > > > name of the default filesystem in the remote cluster which is the
> > > > target of the export (i.e. the value of fs.defaultFS), nummaps is
> > > > the number of mappers to run to perform the export, and proto is
> > > > the protocol to use: either hftp, hdfs or webhdfs. Use hdfs if the
> > > > clusters are compatible. Run this as the hbase user. If you see
> > > > exceptions being thrown during the transfer related to lease
> > > > expirations, reduce the number of mappers or try using the
> > > > -bandwidth option. You may also see many "File does not exist"
> > > > warnings in the log output. These can be displayed continuously for
> > > > several minutes for a large table, but they appear to be noise and
> > > > can be ignored, so be patient. However, it is also very common for
> > > > this command to fail due to a variety of file ownership conflicts,
> > > > so you may need to fiddle to get everything right. A failure of
> > > > this command often leaves garbage on the target system that must be
> > > > deleted; if that is the case, the command will fail with info on
> > > > what needs to be cleaned up.
> > > >
> > > > If exporting from an HBase 0.94 cluster to an HBase 0.98 cluster,
> > > > you will need to use the webhdfs protocol (or possibly hftp, though
> > > > I couldn't get that to work). You also need to manually move some
> > > > files around because the snapshot layouts have changed.
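> > > >
> > > > For illustration (the NameNode host, webhdfs port and mapper count
> > > > here are hypothetical; substitute your own values), a concrete
> > > > 0.94-to-0.98 export over webhdfs might look like:
> > > >
> > > >   hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot snappy \
> > > >     -copy-to webhdfs://remhost:50070/apps/hbase/data -mappers 16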
> > > >
> > > > Based on the example above, on the 0.98 cluster do the following.
> > > >
> > > > Check whether any imports already exist for the table:
> > > >
> > > >   hadoop fs -ls /apps/hbase/data/archive/data/default
> > > >
> > > > and check whether snappyTable is listed, where snappyTable is the
> > > > source table of the snapshot (e.g. hosts). If the source table is
> > > > listed, then merge the new snapshot data into the existing snapshot
> > > > data:
> > > >
> > > >   hadoop fs -mv /apps/hbase/data/.archive/snappyTable/* /apps/hbase/data/archive/data/default/snappyTable
> > > >
> > > >   hadoop fs -rm -r /apps/hbase/data/.archive/snappyTable
> > > >
> > > > Otherwise, create and populate the snapshot data directory:
> > > >
> > > >   hadoop fs -mv /apps/hbase/data/.archive/snappyTable /apps/hbase/data/archive/data/default
> > > >
> > > > (snappyTable is the source table of the snappy snapshot.)
> > > >
> > > > In either case, update the snapshot metadata files as follows:
> > > >
> > > >   hadoop fs -mkdir /apps/hbase/data/.hbase-snapshot/snappy/.tabledesc
> > > >
> > > >   hadoop fs -mv /apps/hbase/data/.hbase-snapshot/snappy/.tableinfo.0000000001 /apps/hbase/data/.hbase-snapshot/snappy/.tabledesc
> > > >
> > > > At this point, you should be able to do a restore_snapshot from the
> > > > HBase shell.
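> > > >
> > > > For example, using the snapshot name from above (the disable step
> > > > is only needed if the table already exists and is enabled on the
> > > > destination cluster):
> > > >
> > > >   hbase> disable 'snappyTable'
> > > >   hbase> restore_snapshot 'snappy'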
> > > >
> > > > On Dec 15, 2014, at 8:52 PM, lars hofhansl wrote:
> > > >
> > > > > Nope :( Replication uses RPC and that was changed to protobufs.
> > > > > AFAIK snapshots can also not be exported from 0.94 to 0.98. We
> > > > > have a really shitty story here.
> > > > >
> > > > > From: Sean Busbey
> > > > > To: user
> > > > > Sent: Monday, December 15, 2014 5:04 PM
> > > > > Subject: Re: 0.94 going forward
> > > > >
> > > > > Does replication and snapshot export work from 0.94.6+ to a 0.96
> > > > > or 0.98 cluster?
> > > > >
> > > > > Presuming it does, shouldn't a site be able to use a
> > > > > multiple-cluster setup to do a cut over of a client application?
> > > > >
> > > > > That doesn't help with needing downtime to do the eventual
> > > > > upgrade, but it mitigates the impact on the downstream app.
> > > > >
> > > > > --
> > > > > Sean
> > > > >
> > > > > On Dec 15, 2014 6:51 PM, "Jeremy Carroll" wrote:
> > > > >
> > > > >> Which is why I feel that a lot of customers are still on 0.94.
> > > > >> Pretty much trapped unless you want to take downtime for your
> > > > >> site. Any type of guidance would be helpful. We are currently in
> > > > >> the process of designing our own system to deal with this.
> > > > >>
> > > > >> On Mon, Dec 15, 2014 at 4:47 PM, Andrew Purtell <apurtell@apache.org>
> > > > >> wrote:
> > > > >>>
> > > > >>> Zero downtime upgrade from 0.94 won't be possible. See
> > > > >>> http://hbase.apache.org/book.html#d0e5199
> > > > >>>
> > > > >>> On Mon, Dec 15, 2014 at 4:44 PM, Jeremy Carroll <phobos182@gmail.com>
> > > > >>> wrote:
> > > > >>>>
> > > > >>>> Looking for guidance on how to do a zero downtime upgrade from
> > > > >>>> 0.94 -> 0.98 (or 1.0 if it launches soon). As soon as we can
> > > > >>>> figure this out, we will migrate over.
> > > > >>>>
> > > > >>>> On Mon, Dec 15, 2014 at 1:37 PM, Esteban Gutierrez <esteban@cloudera.com>
> > > > >>>> wrote:
> > > > >>>>>
> > > > >>>>> Hi Lars,
> > > > >>>>>
> > > > >>>>> Thanks for bringing this up for discussion. From my experience
> > > > >>>>> I can tell that 0.94 is very stable, but that shouldn't be a
> > > > >>>>> blocker to considering EOL'ing it. Are you considering any
> > > > >>>>> specific timeframe for that?
> > > > >>>>>
> > > > >>>>> thanks,
> > > > >>>>> esteban.
> > > > >>>>>
> > > > >>>>> --
> > > > >>>>> Cloudera, Inc.
> > > > >>>>>
> > > > >>>>> On Mon, Dec 15, 2014 at 11:46 AM, Koert Kuipers <koert@tresata.com>
> > > > >>>>> wrote:
> > > > >>>>>>
> > > > >>>>>> Given that CDH4 is HBase 0.94, I don't believe nobody is
> > > > >>>>>> using it. For our clients the majority is on 0.94 (versus
> > > > >>>>>> 0.96 and up).
> > > > >>>>>>
> > > > >>>>>> So I am going with (1), it's very stable!
> > > > >>>>>>
> > > > >>>>>> On Mon, Dec 15, 2014 at 1:53 PM, lars hofhansl <larsh@apache.org>
> > > > >>>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>> Over the past few months the rate of change in 0.94 has
> > > > >>>>>>> slowed significantly. 0.94.25 was released on Nov 15th, and
> > > > >>>>>>> since then we have had only 4 changes.
> > > > >>>>>>>
> > > > >>>>>>> This could mean two things: (1) 0.94 is very stable now or
> > > > >>>>>>> (2) nobody is using it (at least nobody is contributing to
> > > > >>>>>>> it anymore).
> > > > >>>>>>>
> > > > >>>>>>> If anybody out there is still using 0.94 and is not planning
> > > > >>>>>>> to upgrade to 0.98 or later soon (which will require
> > > > >>>>>>> downtime), please speak up. Otherwise it might be time to
> > > > >>>>>>> think about EOL'ing 0.94.
> > > > >>>>>>>
> > > > >>>>>>> It's not actually much work to do these releases, especially
> > > > >>>>>>> when they are so small, but I'd like to continue only if
> > > > >>>>>>> they are actually used. In any case, I am going to spin
> > > > >>>>>>> 0.94.26 with the current 4 fixes today or tomorrow.
> > > > >>>>>>>
> > > > >>>>>>> -- Lars
> > > > >>>
> > > > >>> --
> > > > >>> Best regards,
> > > > >>>
> > > > >>>    - Andy
> > > > >>>
> > > > >>> Problems worthy of attack prove their worth by hitting back.
> > > > >>> - Piet Hein (via Tom White)
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)