Subject: Re: 0.94 going forward
From: Ted Yu
To: user@hbase.apache.org
Date: Tue, 16 Dec 2014 11:53:23 -0800

bq. Then just fail over the applications to the 0.98 cluster

However, the application needs to be compiled with 0.98 jars because the
RPC has changed.

Cheers

On Tue, Dec 16, 2014 at 11:42 AM, Esteban Gutierrez wrote:

> +1 Andrew. It's not a simple task; it is error prone and can cause data
> loss if not performed correctly, and we don't have tooling to fix broken
> snapshots if they are moved manually.
>
> BTW 0.98 should migrate an old snapshot dir to the new post-namespaces
> directory hierarchy after starting HBase from a 0.94 layout. If the goal
> is to minimize downtime, this is probably a better approach: bootstrap
> the destination cluster with 0.94, with snapshots and replication
> enabled, then use the ExportSnapshot tool to copy the snapshots, import
> the snapshots, and use replication on the remote cluster until the delta
> is minimal. Then stop the destination cluster and upgrade to 0.98 (that
> should take care of migrating everything without user intervention).
> Once the destination cluster is migrated to 0.98, use the replication
> bridge tool to catch up with the new edits and reduce the delta between
> both clusters. Then just fail over the applications to the 0.98 cluster,
> and finally repeat the upgrade process on the 0.94 cluster.
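>
> An illustrative sketch only -- the ZooKeeper quorum, table and column
> family names below are hypothetical -- the 0.94-to-0.94 replication part
> of this plan, with hbase.replication set to true in hbase-site.xml on
> both clusters, would look roughly like the following from the HBase
> shell on the source cluster, followed by the ExportSnapshot copy and
> import described above:
>
>   add_peer '1', 'zk1.example.com,zk2.example.com,zk3.example.com:2181:/hbase'
>   disable 'usertable'
>   alter 'usertable', {NAME => 'cf', REPLICATION_SCOPE => 1}
>   enable 'usertable'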
>
> my 2 cents,
> esteban.
>
> --
> Cloudera, Inc.
>
>
> On Tue, Dec 16, 2014 at 9:37 AM, Andrew Purtell wrote:
> >
> > I disagree. Before adding something like that to the ref guide, we
> > should actually agree to support it as a migration strategy. We're not
> > there yet. And although it's a heroic process, we can take steps to
> > make it less kludgy if so.
> >
> > On Tue, Dec 16, 2014 at 9:27 AM, Ted Yu wrote:
> > >
> > > Good summary, Brian.
> > >
> > > This should be added to the ref guide.
> > >
> > > Cheers
> > >
> > > On Tue, Dec 16, 2014 at 4:17 AM, Brian Jeltema
> > > <brian.jeltema@digitalenvoy.net> wrote:
> > > >
> > > > I have been able to export snapshots from 0.94 to 0.98. I've pasted
> > > > the instructions that I developed and published on our internal
> > > > wiki. I also had to significantly increase retry count parameters
> > > > due to a high number of timeout failures during the export.
> > > >
> > > > Cross-cluster transfers
> > > >
> > > > To export a snapshot to a different cluster:
> > > >
> > > >   hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot snappy -copy-to proto://remhost/apps/hbase/data -mappers nummaps
> > > >
> > > > where snappy is the local snapshot to export, remhost is the host
> > > > name of the default filesystem in the remote cluster which is the
> > > > target of the export (i.e. the value of fs.defaultFS), nummaps is
> > > > the number of mappers to run to perform the export, and proto is
> > > > the protocol to use: either hftp, hdfs or webhdfs. Use hdfs if the
> > > > clusters are compatible. Run this as the hbase user. If you see
> > > > exceptions being thrown during the transfer related to lease
> > > > expirations, reduce the number of mappers or try using the
> > > > -bandwidth option. You may also see many "File does not exist"
> > > > warnings in the log output. These can be displayed continuously for
> > > > several minutes for a large table, but they appear to be noise and
> > > > can be ignored, so be patient. However, it is also very common for
> > > > this command to fail due to a variety of file ownership conflicts,
> > > > so you may need to fiddle to get everything right. A failure of
> > > > this command often leaves garbage on the target system that must be
> > > > deleted; if that is the case, the command will fail with info on
> > > > what needs to be cleaned up.
> > > >
> > > > If exporting from an HBase 0.94 cluster to an HBase 0.98 cluster,
> > > > you will need to use the webhdfs protocol (or possibly hftp, though
> > > > I couldn't get that to work). You also need to manually move some
> > > > files around because the snapshot layouts have changed.
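> > > >
> > > > For illustration (the NameNode host, webhdfs port and mapper count
> > > > here are hypothetical; substitute your own values), a concrete
> > > > 0.94-to-0.98 export over webhdfs might look like:
> > > >
> > > >   hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot snappy \
> > > >     -copy-to webhdfs://remhost:50070/apps/hbase/data -mappers 16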
> > > >
> > > > Based on the example above, on the 0.98 cluster do the following.
> > > >
> > > > Check whether any imports already exist for the table:
> > > >
> > > >   hadoop fs -ls /apps/hbase/data/archive/data/default
> > > >
> > > > and check whether snappyTable is listed, where snappyTable is the
> > > > source table of the snapshot (e.g. hosts). If the source table is
> > > > listed, then merge the new snapshot data into the existing snapshot
> > > > data:
> > > >
> > > >   hadoop fs -mv /apps/hbase/data/.archive/snappyTable/* /apps/hbase/data/archive/data/default/snappyTable
> > > >
> > > >   hadoop fs -rm -r /apps/hbase/data/.archive/snappyTable
> > > >
> > > > Otherwise, create and populate the snapshot data directory:
> > > >
> > > >   hadoop fs -mv /apps/hbase/data/.archive/snappyTable /apps/hbase/data/archive/data/default
> > > >
> > > > (snappyTable is the source table of the snappy snapshot.)
> > > >
> > > > In either case, update the snapshot metadata files as follows:
> > > >
> > > >   hadoop fs -mkdir /apps/hbase/data/.hbase-snapshot/snappy/.tabledesc
> > > >
> > > >   hadoop fs -mv /apps/hbase/data/.hbase-snapshot/snappy/.tableinfo.0000000001 /apps/hbase/data/.hbase-snapshot/snappy/.tabledesc
> > > >
> > > > At this point, you should be able to do a restore_snapshot from the
> > > > HBase shell.
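> > > >
> > > > For example, using the snapshot name from above (the disable step
> > > > is only needed if the table already exists and is enabled on the
> > > > destination cluster):
> > > >
> > > >   hbase> disable 'snappyTable'
> > > >   hbase> restore_snapshot 'snappy'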
> > > >
> > > > On Dec 15, 2014, at 8:52 PM, lars hofhansl wrote:
> > > >
> > > > > Nope :( Replication uses RPC and that was changed to protobufs.
> > > > > AFAIK snapshots can also not be exported from 0.94 to 0.98. We
> > > > > have a really shitty story here.
> > > > >
> > > > > From: Sean Busbey
> > > > > To: user
> > > > > Sent: Monday, December 15, 2014 5:04 PM
> > > > > Subject: Re: 0.94 going forward
> > > > >
> > > > > Does replication and snapshot export work from 0.94.6+ to a 0.96
> > > > > or 0.98 cluster?
> > > > >
> > > > > Presuming it does, shouldn't a site be able to use a
> > > > > multiple-cluster setup to do a cut over of a client application?
> > > > >
> > > > > That doesn't help with needing downtime to do the eventual
> > > > > upgrade, but it mitigates the impact on the downstream app.
> > > > >
> > > > > --
> > > > > Sean
> > > > >
> > > > > On Dec 15, 2014 6:51 PM, "Jeremy Carroll" wrote:
> > > > >
> > > > >> Which is why I feel that a lot of customers are still on 0.94.
> > > > >> Pretty much trapped unless you want to take downtime for your
> > > > >> site. Any type of guidance would be helpful. We are currently in
> > > > >> the process of designing our own system to deal with this.
> > > > >>
> > > > >> On Mon, Dec 15, 2014 at 4:47 PM, Andrew Purtell <apurtell@apache.org>
> > > > >> wrote:
> > > > >>>
> > > > >>> Zero downtime upgrade from 0.94 won't be possible. See
> > > > >>> http://hbase.apache.org/book.html#d0e5199
> > > > >>>
> > > > >>> On Mon, Dec 15, 2014 at 4:44 PM, Jeremy Carroll <phobos182@gmail.com>
> > > > >>> wrote:
> > > > >>>>
> > > > >>>> Looking for guidance on how to do a zero downtime upgrade from
> > > > >>>> 0.94 -> 0.98 (or 1.0 if it launches soon). As soon as we can
> > > > >>>> figure this out, we will migrate over.
> > > > >>>>
> > > > >>>> On Mon, Dec 15, 2014 at 1:37 PM, Esteban Gutierrez <esteban@cloudera.com>
> > > > >>>> wrote:
> > > > >>>>>
> > > > >>>>> Hi Lars,
> > > > >>>>>
> > > > >>>>> Thanks for bringing this up for discussion. From my experience
> > > > >>>>> I can tell that 0.94 is very stable, but that shouldn't be a
> > > > >>>>> blocker to considering EOL'ing it. Are you considering any
> > > > >>>>> specific timeframe for that?
> > > > >>>>>
> > > > >>>>> thanks,
> > > > >>>>> esteban.
> > > > >>>>>
> > > > >>>>> --
> > > > >>>>> Cloudera, Inc.
> > > > >>>>>
> > > > >>>>> On Mon, Dec 15, 2014 at 11:46 AM, Koert Kuipers <koert@tresata.com>
> > > > >>>>> wrote:
> > > > >>>>>>
> > > > >>>>>> Given that CDH4 is HBase 0.94, I don't believe nobody is
> > > > >>>>>> using it. For our clients the majority is on 0.94 (versus
> > > > >>>>>> 0.96 and up).
> > > > >>>>>>
> > > > >>>>>> So I am going with (1), it's very stable!
> > > > >>>>>>
> > > > >>>>>> On Mon, Dec 15, 2014 at 1:53 PM, lars hofhansl <larsh@apache.org>
> > > > >>>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>> Over the past few months the rate of change in 0.94 has
> > > > >>>>>>> slowed significantly. 0.94.25 was released on Nov 15th, and
> > > > >>>>>>> since then we have had only 4 changes.
> > > > >>>>>>>
> > > > >>>>>>> This could mean two things: (1) 0.94 is very stable now or
> > > > >>>>>>> (2) nobody is using it (at least nobody is contributing to
> > > > >>>>>>> it anymore).
> > > > >>>>>>>
> > > > >>>>>>> If anybody out there is still using 0.94 and is not planning
> > > > >>>>>>> to upgrade to 0.98 or later soon (which will require
> > > > >>>>>>> downtime), please speak up. Otherwise it might be time to
> > > > >>>>>>> think about EOL'ing 0.94.
> > > > >>>>>>>
> > > > >>>>>>> It's not actually much work to do these releases, especially
> > > > >>>>>>> when they are so small, but I'd like to continue only if
> > > > >>>>>>> they are actually used. In any case, I am going to spin
> > > > >>>>>>> 0.94.26 with the current 4 fixes today or tomorrow.
> > > > >>>>>>>
> > > > >>>>>>> -- Lars
> > > > >>>
> > > > >>> --
> > > > >>> Best regards,
> > > > >>>
> > > > >>>    - Andy
> > > > >>>
> > > > >>> Problems worthy of attack prove their worth by hitting back.
> > > > >>> - Piet Hein (via Tom White)
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)