hbase-dev mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912
Date Mon, 01 Aug 2016 19:03:36 GMT
Carter Shanklin posted a blog article about the feature, with some use
cases and examples of command-line usage:

https://hortonworks.com/blog/coming-hdp-2-5-incremental-backup-restore-apache-hbase-apache-phoenix/

-Vlad

On Wed, Jul 20, 2016 at 1:25 PM, Vladimir Rodionov <vladrodionov@gmail.com>
wrote:

> Ok, got it.
>
> -Vlad
>
> On Wed, Jul 20, 2016 at 12:15 PM, Enis Söztutar <enis@apache.org> wrote:
>
>> We keep the WALs, which can accumulate a lot if the use case is to do
>> backups only infrequently. This will definitely cause issues, since HDFS
>> space will fill up. That is why we may need an option to disable
>> incremental backups and delete the WAL references.
>>
>> Enis
>>
>> On Tue, Jul 19, 2016 at 6:33 PM, Vladimir Rodionov <
>> vladrodionov@gmail.com>
>> wrote:
>>
>> > Why would anyone ever need to disable incremental backups? If you do not
>> > need them, just run only full backups.
>> >
>> > -Vlad
>> >
>> > On Tue, Jul 19, 2016 at 6:21 PM, Enis Söztutar <enis@apache.org> wrote:
>> >
>> > > Thanks Matteo for chiming in.
>> > >
>> > > On Tue, Jul 19, 2016 at 5:02 PM, Matteo Bertozzi <theo.bertozzi@gmail.com>
>> > > wrote:
>> > >
>> > > > I did some review early on, but then lost track of the changes.
>> > > > but I'd like to give a quick review to the full code once people here
>> > > > are ok with getting this feature in master (2.0).
>> > > > (let's say we put a deadline for reviews, like 1 week for reviewing the
>> > > > full stuff after everyone agrees to get this in. just to avoid holding
>> > > > this for too long, but still enough time for people that are interested
>> > > > to look at it. we did the same thing for MOB with a mega patch
>> > > > https://reviews.apache.org/r/36391/)
>> > > >
>> > >
>> > > This sounds good. Vladimir / Ted, how do you guys want to handle the
>> > > merge? As a giant patch, or as a rebase of the code in the branch and
>> > > then a git merge?
>> > >
>> > > We need to run a vote when the to-be-merged branch is ready. We can
>> > > set a vote timeout of at least 1 week.
>> > >
>> > >
>> > > >
>> > > > most of the code seemed isolated from the beginning, few changes here
>> > > > and there in the core.
>> > > > so, this side of things seems ok to me.
>> > > >
>> > > > maybe some work to add IT tests as mentioned above, but that should
>> > > > not take long.
>> > > >
>> > > > I don't know if there are already docs, but that is another thing we
>> > > > may want to get in with the merge.
>> > > > a minimal coverage at least on how to use the feature, and maybe
>> > > > calling it out as experimental?
>> > > >
>> > > > my main concerns were around incremental backups.
>> > > > I'm still not convinced: because the WALs contain regions of multiple
>> > > > tables, the incremental backup will keep around WALs with some data
>> > > > that we don't really want in the backup (for space or maybe security
>> > > > reasons).
>> > > >
>> > > > then there was the question of how long I should keep taking
>> > > > incrementals before deciding that a fresh full backup is less costly
>> > > > in terms of space.
>> > > > but I think this incremental merge/compaction was a feature on the
>> > > > roadmap as Phase 3.
>> > > > which I think is ok to get later on,
>> > > > maybe just call out a lifecycle example in the docs under "best
>> > > > practices".
>> > > >
>> > >
>> > > I think this will depend on the use case, and on other factors like the
>> > > available bandwidth, how much data the user is willing to lose in case
>> > > of catastrophic failure, and how "expensive" a full backup is versus an
>> > > incremental one.
>> > >
>> > > The full backup should also be useable by default, so maybe we can make
>> > > an option to not even keep WAL files, and completely disable
>> > > incremental backups?
>> > >
>> > > Enis
>> > >
>> > >
>> > > >
>> > > > has anyone interested in using backups looked at the doc in
>> > > > HBASE-7912?
>> > > > is the current design of incremental backup acceptable for everyone
>> > > > wanting to use this feature?
>> > > > (maybe this should be a question for the @user list and not dev)
>> > > >
>> > > > is there anyone already using this feature, or is it just devs testing
>> > > > it?
>> > > > to me it would be interesting to have a use-case/workflow example,
>> > > > to see if in the real world my concerns about incremental are not
>> > > > showing up.
>> > > >
>> > > > On Tue, Jul 19, 2016 at 1:35 PM, Ted Yu <yuzhihong@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > Gentle ping on this subject.
>> > > > >
>> > > > > The changes are mostly non-intrusive.
>> > > > >
>> > > > > More comments are welcome.
>> > > > >
>> > > > > On Mon, Jul 11, 2016 at 9:29 PM, Vladimir Rodionov <vladrodionov@gmail.com>
>> > > > > wrote:
>> > > > >
>> > > > > > Not that hard, Andrew. I will open JIRA.
>> > > > > >
>> > > > > > -Vlad
>> > > > > >
>> > > > > > On Mon, Jul 11, 2016 at 8:46 PM, Andrew Purtell <andrew.purtell@gmail.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > How hard would it be to convert what you've been using to test
>> > > > > > > end to end during dev into an IT?
>> > > > > > >
>> > > > > > >
>> > > > > > > On Jul 11, 2016, at 5:31 PM, Vladimir Rodionov <vladrodionov@gmail.com>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > >>> Is there an integration test in hbase-it yet? If not, any tips
>> > > > > > > >>> on a semi-automateable way to take backups and restore them?
>> > > > > > > >
>> > > > > > > > We do not have one yet, but we have a lot of unit tests. We
>> > > > > > > > provide 2 APIs for backup:
>> > > > > > > >
>> > > > > > > > 1. Admin.getBackupAdmin
>> > > > > > > >
>> > > > > > > > 2. Command line, via the hbase command.
>> > > > > > > >
>> > > > > > > > Everything is straightforward.
>> > > > > > > >
>> > > > > > > > -Vlad
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >> On Mon, Jul 11, 2016 at 5:23 PM, Dima Spivak <dspivak@cloudera.com>
>> > > > > > > >> wrote:
>> > > > > > > >>
>> > > > > > > >> Is there an integration test in hbase-it yet? If not, any tips
>> > > > > > > >> on a semi-automateable way to take backups and restore them?
>> > > > > > > >>
>> > > > > > > >> -Dima
>> > > > > > > >>
>> > > > > > > >> On Mon, Jul 11, 2016 at 6:42 PM, Vladimir Rodionov <vladrodionov@gmail.com>
>> > > > > > > >> wrote:
>> > > > > > > >>
>> > > > > > > >>> Sorry, wrong links:
>> > > > > > > >>> These are the phases:
>> > > > > > > >>>
>> > > > > > > >>> Phase 1:
>> > > > > > > >>> https://issues.apache.org/jira/browse/HBASE-14030
>> > > > > > > >>> Phase 2:
>> > > > > > > >>> https://issues.apache.org/jira/browse/HBASE-14123
>> > > > > > > >>> Phase 3:
>> > > > > > > >>> https://issues.apache.org/jira/browse/HBASE-14414
>> > > > > > > >>>
>> > > > > > > >>> -Vlad
>> > > > > > > >>>
>> > > > > > > >>> On Mon, Jul 11, 2016 at 4:41 PM, Vladimir Rodionov <vladrodionov@gmail.com>
>> > > > > > > >>> wrote:
>> > > > > > > >>>
>> > > > > > > >>>> These are the phases:
>> > > > > > > >>>>
>> > > > > > > >>>> Phase 1:
>> > > > > > > >>>> https://issues.apache.org/jira/browse/HBASE-14030
>> > > > > > > >>>> <https://issues.apache.org/jira/browse/HBASE-7912>
>> > > > > > > >>>> Phase 2:
>> > > > > > > >>>> https://issues.apache.org/jira/browse/HBASE-14123
>> > > > > > > >>>> <https://issues.apache.org/jira/browse/HBASE-7912>
>> > > > > > > >>>> Phase 3:
>> > > > > > > >>>> https://issues.apache.org/jira/browse/HBASE-14414
>> > > > > > > >>>> <https://issues.apache.org/jira/browse/HBASE-7912>
>> > > > > > > >>>>
>> > > > > > > >>>> -Vlad
>> > > > > > > >>>>
>> > > > > > > >>>>
>> > > > > > > >>>> On Mon, Jul 11, 2016 at 12:21 PM, Enis Söztutar <enis@apache.org>
>> > > > > > > >>>> wrote:
>> > > > > > > >>>>
>> > > > > > > >>>>> As you guys may already know, Vladimir, Ted, Jerry and others
>> > > > > > > >>>>> have been developing the backup / restore functionality in a
>> > > > > > > >>>>> series of issues committed in the separate branch HBASE-7912[1].
>> > > > > > > >>>>>
>> > > > > > > >>>>> Backup / Restore functionality is tracked as a 4-phase project,
>> > > > > > > >>>>> and the first two phases are complete and useable. We are now
>> > > > > > > >>>>> working on Phase 3 items, which are mostly improvements. We think
>> > > > > > > >>>>> that the current code in the branch, containing all Phase 1 and
>> > > > > > > >>>>> Phase 2 items and some Phase 3 items, is useable on its own, and
>> > > > > > > >>>>> we do not have to wait for all the subtickets to be finished to
>> > > > > > > >>>>> make it completely useable (as follow up tickets are mostly
>> > > > > > > >>>>> improvements or optimizations). The improvements in the works are
>> > > > > > > >>>>> all backwards compatible with the existing stuff. Thus, we would
>> > > > > > > >>>>> like to propose that the branch HBASE-7912 be merged into master.
>> > > > > > > >>>>> The parent jira has a design doc that goes into details about the
>> > > > > > > >>>>> implementation and design choices in case you are interested[2].
>> > > > > > > >>>>>
>> > > > > > > >>>>> The changes are largely non-intrusive and confined to the backup
>> > > > > > > >>>>> subsystem.
>> > > > > > > >>>>> The unit tests have been passing on manual runs and we
>> > > > > > > >>>>> (hortonworks) have been running the integration tests as well as
>> > > > > > > >>>>> some other shell-based system tests on a forked version of the
>> > > > > > > >>>>> code. Most of the work has been reviewed by 1, 2 or 3 committers
>> > > > > > > >>>>> already (mostly Ted, myself and Jerry).
>> > > > > > > >>>>>
>> > > > > > > >>>>> What do you guys think? Is it time to call a vote? Any concerns
>> > > > > > > >>>>> or feedback appreciated.
>> > > > > > > >>>>>
>> > > > > > > >>>>> [1] https://issues.apache.org/jira/browse/HBASE-7912
>> > > > > > > >>>>> [2]
>> > > > > > > >>>>> https://issues.apache.org/jira/secure/attachment/12816339/HBaseBackupAndRestore%20-0.91.pdf
>> > > > > > > >>>>>
>> > > > > > > >>>>> Enis
