hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912
Date Sat, 20 Aug 2016 17:38:46 GMT
Thanks, Andrew.

Planning to commit the IT test Monday.

On Sat, Aug 20, 2016 at 10:29 AM, Andrew Purtell <andrew.purtell@gmail.com>
wrote:

> Let's commit the IT to the branch, if you think the v5 patch is ready for
> commit, Ted.
>
> I will be able to spend some time next week trying out the branch via the
> IT, and poking around with the new tools. After that I feel like I'll be
> informed enough to vote on the branch merge.
>
> > On Aug 19, 2016, at 12:38 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > IT test is provided on HBASE-16255.
> >
> > Any other comments?
> >
> > Thanks
> >
> >> On Tue, Aug 2, 2016 at 9:09 PM, Dima Spivak <dspivak@cloudera.com> wrote:
> >>
> >> Any chance of an IT test being added to the branch first? I'd love to put
> >> it through its paces with clusterdock to make sure it behaves well with
> >> fault injection and the like.
> >>
> >> -Dima
> >>
> >>> On Tuesday, August 2, 2016, Ted Yu <yuzhihong@gmail.com> wrote:
> >>>
> >>> Any more comments from the community on whether the merge can be
> >>> conducted?
> >>>
> >>> Thanks
> >>>
> >>> On Mon, Aug 1, 2016 at 12:03 PM, Vladimir Rodionov
> >>> <vladrodionov@gmail.com> wrote:
> >>>
> >>>> Carter Shanklin posted a blog article about the feature, with some
> >>>> use cases and examples of command-line usage:
> >>>> https://hortonworks.com/blog/coming-hdp-2-5-incremental-backup-restore-apache-hbase-apache-phoenix/
> >>>>
> >>>> -Vlad
> >>>>
> >>>> On Wed, Jul 20, 2016 at 1:25 PM, Vladimir Rodionov
> >>>> <vladrodionov@gmail.com> wrote:
> >>>>
> >>>>> Ok, got it.
> >>>>>
> >>>>> -Vlad
> >>>>>
> >>>>> On Wed, Jul 20, 2016 at 12:15 PM, Enis Söztutar <enis@apache.org> wrote:
> >>>>>
> >>>>>> We keep the WALs, which can accumulate a lot if the use case is to
> >>>>>> do backups only infrequently. This will definitely cause issues, since
> >>>>>> HDFS space will get filled up. That is why we may need an option to
> >>>>>> disable incremental backups and have the WAL references deleted.
> >>>>>>
> >>>>>> Enis
> >>>>>>
> >>>>>> On Tue, Jul 19, 2016 at 6:33 PM, Vladimir Rodionov
> >>>>>> <vladrodionov@gmail.com> wrote:
> >>>>>>
> >>>>>>> Why would anyone ever need to disable incremental backups? If you
> >>>>>>> do not need them, just run only full backups.
> >>>>>>>
> >>>>>>> -Vlad
> >>>>>>>
> >>>>>>> On Tue, Jul 19, 2016 at 6:21 PM, Enis Söztutar <enis@apache.org> wrote:
> >>>>>>>
> >>>>>>>> Thanks Matteo for chiming in.
> >>>>>>>>
> >>>>>>>> On Tue, Jul 19, 2016 at 5:02 PM, Matteo Bertozzi
> >>>>>>>> <theo.bertozzi@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> I did some review early on, but then lost track of the changes.
> >>>>>>>>> but I'd like to give a quick review to the full code once people
> >>>>>>>>> here are ok with getting this feature in master (2.0).
> >>>>>>>>> (let's say we put a deadline for reviews, like 1 week for reviewing
> >>>>>>>>> the full stuff after everyone agrees to get this in. just to avoid
> >>>>>>>>> holding this for too long, but still enough time for people that
> >>>>>>>>> are interested to look at it. we did the same thing for MOB with a
> >>>>>>>>> mega patch: https://reviews.apache.org/r/36391/)
> >>>>>>>>
> >>>>>>>> This sounds good. Vladimir / Ted, how do you guys want to handle
> >>>>>>>> the merge? As a giant patch, or as a rebase of the code in the
> >>>>>>>> branch followed by a git merge?
> >>>>>>>>
> >>>>>>>> We need to run a vote when the to-be-merged branch is ready. We
> >>>>>>>> can set a vote timeout of at least 1 week.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> most of the code seemed isolated from the beginning, with few
> >>>>>>>>> changes here and there in the core.
> >>>>>>>>> so, this side of things seems ok to me.
> >>>>>>>>>
> >>>>>>>>> maybe some work to add IT tests as mentioned above, but that
> >>>>>>>>> should not take long.
> >>>>>>>>>
> >>>>>>>>> I don't know if there are already docs, but that is another thing
> >>>>>>>>> we may want to get in with the merge.
> >>>>>>>>> a minimal coverage at least on how to use the feature, and maybe
> >>>>>>>>> calling it out as experimental?
> >>>>>>>>>
> >>>>>>>>> my main concern was around incremental backups.
> >>>>>>>>> I'm still not comfortable with the fact that, because the WALs
> >>>>>>>>> contain regions of multiple tables, the incremental backup will
> >>>>>>>>> keep around WALs with some data that we don't really want in the
> >>>>>>>>> backup (for space or maybe security reasons).
> >>>>>>>>>
> >>>>>>>>> then there was the question of how long I should keep taking
> >>>>>>>>> incrementals before deciding that a fresh full backup is less
> >>>>>>>>> costly in terms of space.
> >>>>>>>>> but I think this incremental merge/compaction was a feature on the
> >>>>>>>>> roadmap as Phase 3,
> >>>>>>>>> which I think is ok to get later on.
> >>>>>>>>> maybe just call out a lifecycle example in the docs under "best
> >>>>>>>>> practices".
> >>>>>>>>
> >>>>>>>> I think this will depend on the use case, and other factors like
> >>>>>>>> the bandwidth available, how much data the user is willing to lose
> >>>>>>>> in case of catastrophic failure, and how "expensive" a full backup
> >>>>>>>> is versus an incremental one.
> >>>>>>>>
> >>>>>>>> The full backup should also be useable by default, so maybe we can
> >>>>>>>> make an option to not even keep WAL files, and completely disable
> >>>>>>>> incremental backups?
> >>>>>>>>
> >>>>>>>> Enis
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> has anyone interested in using backups looked at the doc in
> >>>>>>>>> HBASE-7912?
> >>>>>>>>> is the current design of incremental backup acceptable for
> >>>>>>>>> everyone wanting to use this feature?
> >>>>>>>>> (maybe this should be a question for the @user list and not dev)
> >>>>>>>>>
> >>>>>>>>> is anyone already using this feature, or is it just devs testing it?
> >>>>>>>>> to me it would be interesting to have a use-case/workflow example,
> >>>>>>>>> to see whether my concerns about incrementals show up in the real
> >>>>>>>>> world.
> >>>>>>>>>
> >>>>>>>>> On Tue, Jul 19, 2016 at 1:35 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Gentle ping on this subject.
> >>>>>>>>>>
> >>>>>>>>>> The changes are mostly non-intrusive.
> >>>>>>>>>>
> >>>>>>>>>> More comments are welcome.
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Jul 11, 2016 at 9:29 PM, Vladimir Rodionov
> >>>>>>>>>> <vladrodionov@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Not that hard, Andrew. I will open a JIRA.
> >>>>>>>>>>>
> >>>>>>>>>>> -Vlad
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Jul 11, 2016 at 8:46 PM, Andrew Purtell
> >>>>>>>>>>> <andrew.purtell@gmail.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> How hard would it be to convert what you've been using to test
> >>>>>>>>>>>> end to end during dev into an IT?
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Jul 11, 2016, at 5:31 PM, Vladimir Rodionov
> >>>>>>>>>>>> <vladrodionov@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>>>> Is there an integration test in hbase-it yet? If not, any tips
> >>>>>>>>>>>>>>> on a semi-automateable way to take backups and restore them?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> We do not have one yet, but we have a lot of unit tests. We
> >>>>>>>>>>>>> provide 2 APIs for backup:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 1. Admin.getBackupAdmin
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 2. Command line, via the hbase command.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Everything is straightforward.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -Vlad
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Mon, Jul 11, 2016 at 5:23 PM, Dima Spivak
> >>>>>>>>>>>>>> <dspivak@cloudera.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Is there an integration test in hbase-it yet? If not, any tips
> >>>>>>>>>>>>>> on a semi-automateable way to take backups and restore them?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -Dima
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Mon, Jul 11, 2016 at 6:42 PM, Vladimir Rodionov
> >>>>>>>>>>>>>> <vladrodionov@gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Sorry, wrong links:
> >>>>>>>>>>>>>>> These are the phases:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Phase 1:
> >>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-14030
> >>>>>>>>>>>>>>> Phase 2:
> >>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-14123
> >>>>>>>>>>>>>>> Phase 3:
> >>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-14414
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -Vlad
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Mon, Jul 11, 2016 at 4:41 PM, Vladimir Rodionov
> >>>>>>>>>>>>>>> <vladrodionov@gmail.com> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> These are the phases:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Phase 1:
> >>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-14030
> >>>>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/HBASE-7912>
> >>>>>>>>>>>>>>>> Phase 2:
> >>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-14123
> >>>>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/HBASE-7912>
> >>>>>>>>>>>>>>>> Phase 3:
> >>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/HBASE-14414
> >>>>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/HBASE-7912>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> -Vlad
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Mon, Jul 11, 2016 at 12:21 PM, Enis Söztutar
> >>>>>>>>>>>>>>>> <enis@apache.org> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> As you guys may already know, Vladimir, Ted, Jerry and others
> >>>>>>>>>>>>>>>>> have been developing the backup / restore functionality in a
> >>>>>>>>>>>>>>>>> series of issues committed in the separate branch
> >>>>>>>>>>>>>>>>> HBASE-7912[1].
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Backup / Restore functionality is tracked as a 4-phase
> >>>>>>>>>>>>>>>>> project, and the first two phases are complete and useable.
> >>>>>>>>>>>>>>>>> We are now working on Phase 3 items, which are mostly
> >>>>>>>>>>>>>>>>> improvements. We think that the current code in the branch,
> >>>>>>>>>>>>>>>>> containing all Phase 1 and Phase 2 items and some Phase 3
> >>>>>>>>>>>>>>>>> items, is useable on its own, and we do not have to wait for
> >>>>>>>>>>>>>>>>> all the subtickets to be finished to make it completely
> >>>>>>>>>>>>>>>>> useable (as follow-up tickets are mostly improvements or
> >>>>>>>>>>>>>>>>> optimizations). The improvements in the works are all
> >>>>>>>>>>>>>>>>> backwards compatible with the existing stuff. Thus, we would
> >>>>>>>>>>>>>>>>> like to propose that the branch HBASE-7912 be merged into
> >>>>>>>>>>>>>>>>> master. The parent jira has a design doc that goes into
> >>>>>>>>>>>>>>>>> details about the implementation and design choices in case
> >>>>>>>>>>>>>>>>> you are interested[2].
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Most of the changes are non-intrusive and confined to the
> >>>>>>>>>>>>>>>>> backup subsystem. The unit tests have been passing on manual
> >>>>>>>>>>>>>>>>> runs, and we (Hortonworks) have been running the integration
> >>>>>>>>>>>>>>>>> tests as well as some other shell-based system tests on a
> >>>>>>>>>>>>>>>>> forked version of the code. Most of the work has already been
> >>>>>>>>>>>>>>>>> reviewed by 1, 2 or 3 committers (mostly Ted, myself and
> >>>>>>>>>>>>>>>>> Jerry).
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> What do you guys think? Is it time to call a vote? Any
> >>>>>>>>>>>>>>>>> concerns or feedback appreciated.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/HBASE-7912
> >>>>>>>>>>>>>>>>> [2] https://issues.apache.org/jira/secure/attachment/12816339/HBaseBackupAndRestore%20-0.91.pdf
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Enis
> >>
> >>
> >> --
> >> -Dima
> >>
>
