hbase-dev mailing list archives

From Matteo Bertozzi <theo.berto...@gmail.com>
Subject Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912
Date Fri, 09 Sep 2016 20:02:47 GMT
We should probably have a "current limitations" section in the user guide
(maybe near the technical details).
Some of this stuff may be in the final 2.0, since some tasks are marked as
phase 3, but I think it is important to mention things like:
 - if you write to the table with Durability.SKIP_WAL, your data will not
be in the incremental backup
 - if you bulk-load files, that data will not be in the incremental backup
(HBASE-14417)
 - the incremental backup will contain not only the data of the table you
specified but also the regions from other tables that are on the same set
of RSs (HBASE-14141) ...maybe add a note about security around this topic
 - the incremental backup will contain not just the "latest row" between
backup A and B, but also all the updates that occurred in between. However,
the restore does not allow you to restore up to a certain point in time;
the restore will always be up to the "latest backup point".
 - you should limit the number of "incremental" backups to N (or maybe to
a SIZE), to avoid replay time becoming the bottleneck. (HBASE-14135)
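
The last point above (capping the chain of incremental backups by a count N
or by a total SIZE) can be sketched as a small policy check. This is only an
illustration under stated assumptions: the class BackupChainPolicy, its
method, and both thresholds are hypothetical names invented here and are not
part of HBase or of the HBASE-7912 branch.

```java
import java.util.List;

/**
 * Illustrative only: decides when to stop stacking incremental backups
 * and take a fresh full backup instead. Not part of HBase.
 */
final class BackupChainPolicy {
    private final int maxIncrementals;       // hypothetical limit "N"
    private final long maxIncrementalBytes;  // hypothetical limit "SIZE"

    BackupChainPolicy(int maxIncrementals, long maxIncrementalBytes) {
        this.maxIncrementals = maxIncrementals;
        this.maxIncrementalBytes = maxIncrementalBytes;
    }

    /**
     * @param incrementalSizes size in bytes of each incremental backup taken
     *                         since the last full backup, oldest first
     * @return true if the next backup should be a full one
     */
    boolean shouldTakeFullBackup(List<Long> incrementalSizes) {
        // Too many incrementals to replay on restore.
        if (incrementalSizes.size() >= maxIncrementals) {
            return true;
        }
        // Too much incremental data to replay on restore.
        long totalBytes = 0L;
        for (long size : incrementalSizes) {
            totalBytes += size;
        }
        return totalBytes >= maxIncrementalBytes;
    }
}
```

With N = 3 and SIZE = 1000 bytes, a chain of two incrementals totalling 300
bytes would continue; once three incrementals exist, or once the chain
reaches 1000 bytes, the next backup would be a full one.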


On Fri, Sep 9, 2016 at 12:25 PM, Vladimir Rodionov <vladrodionov@gmail.com>
wrote:

> User Guide, prepared by our tech writer Frank Welsh, was attached to
> HBASE-7912.
>
> -Vlad
>
> On Fri, Sep 9, 2016 at 12:16 PM, Vladimir Rodionov <vladrodionov@gmail.com>
> wrote:
>
> > Do not worry, Sean: the doc is coming today as a preview, and our writer
> > Frank will be working on putting it into the Apache repo. The timeline
> > depends on Frank's schedule, but I hope we will get it sooner rather than
> > later.
> >
> > As for failure testing, we are focusing only on keeping the backup system
> > data in a consistent state in the presence of any type of failure. We are
> > not going to implement anything more "fancy" than that. We allow both
> > backup and restore to fail. What we do not allow is for system data to be
> > corrupted. Will that suffice for you? Do you have any other concerns you
> > want us to address?
> >
> > -Vlad
> >
> >
> > On Fri, Sep 9, 2016 at 10:56 AM, Sean Busbey <busbey@apache.org> wrote:
> >
> >> "docs will come to Apache soon" does not address my concern around docs
> >> at all, unless said docs have already made it into the project repo. I
> >> don't want third party resources for using a major and important feature
> >> of the project, I want us to provide end users with what they need to
> >> get the job done.
> >>
> >> I see some calls for patience on the failure testing, but the appeal to
> >> us having done a bad job of requiring proper tests of previous features
> >> just makes me more concerned about not getting them here. I don't want
> >> to set yet another bad example that will then be pointed to in the
> >> future.
> >>
> >> On Sep 8, 2016 10:50, "Ted Yu" <yuzhihong@gmail.com> wrote:
> >>
> >> > Is there any concern that has not been addressed?
> >> >
> >> > Do we need another vote thread?
> >> >
> >> > Thanks
> >> >
> >> > On Thu, Sep 8, 2016 at 9:21 AM, Andrew Purtell <apurtell@apache.org>
> >> > wrote:
> >> >
> >> > > Vlad,
> >> > >
> >> > > I apologize for using the term 'half-baked' in a way that could
> >> > > seem a description of HBASE-7912. I meant that as a general
> >> > > hypothetical.
> >> > >
> >> > > On Wed, Sep 7, 2016 at 9:36 AM, Vladimir Rodionov
> >> > > <vladrodionov@gmail.com> wrote:
> >> > >
> >> > > > >> I'm not sure that "There is already lots of half-baked code in
> >> > > > >> the branch, so what's the harm in adding more?"
> >> > > >
> >> > > > I meant not production-ready yet. This is the 2.0 development
> >> > > > branch and, hence, many features are in the works, not yet well
> >> > > > tested, etc. I do not consider backup a half-baked feature: it
> >> > > > has passed our internal QA and has very good docs, which we will
> >> > > > provide to Apache shortly.
> >> > > >
> >> > > > -Vlad
> >> > > >
> >> > > > On Wed, Sep 7, 2016 at 9:13 AM, Andrew Purtell
> >> > > > <apurtell@apache.org> wrote:
> >> > > >
> >> > > > > We shouldn't admit half baked changes that won't be finished.
> >> > > > > However in this case the crew working on this feature are long
> >> > > > > timers and less likely than just about anyone to leave something
> >> > > > > in a half baked state. Of course there is no guarantee how
> >> > > > > anything will turn out, but I am willing to take a little on
> >> > > > > faith if they feel their best path forward now is to merge to
> >> > > > > trunk. I only wish I had bandwidth to have done some real
> >> > > > > kicking of the tires by now. Maybe this week.
> >> > > > >
> >> > > > > (Yes, I'm using some of that time for this email :-) but I type
> >> > > > > fast.)
> >> > > > >
> >> > > > > That said, I would like to agitate for making 2.0 more real and
> >> > > > > spend some time on it now that I'm winding down with 0.98. I
> >> > > > > think that means branching for 2.0 real soon now and even
> >> > > > > evicting things from 2.0 branch that aren't finished or stable,
> >> > > > > leaving them only once again in the master branch. Or, maybe
> >> > > > > just evicting them. Let's take it case by case.
> >> > > > >
> >> > > > > I think this feature can come in relatively safely. As added
> >> > > > > insurance, let's admit the possibility it could be reverted on
> >> > > > > the 2.0 branch if folks working on stabilizing 2.0 decide to
> >> > > > > evict it because it is unfinished or unstable, because that
> >> > > > > certainly can happen. I would expect if talk like that starts,
> >> > > > > we'd get help finishing or stabilizing what's under discussion
> >> > > > > for revert. Or, we'd have a revert. Either way the outcome is
> >> > > > > acceptable.
> >> > > > >
> >> > > > >
> >> > > > > On Wed, Sep 7, 2016 at 8:56 AM, Dima Spivak
> >> > > > > <dimaspivak@apache.org> wrote:
> >> > > > >
> >> > > > > > I'm not sure that "There is already lots of half-baked code in
> >> > > > > > the branch, so what's the harm in adding more?" is a good code
> >> > > > > > commit philosophy for a fault-tolerant distributed data
> >> > > > > > store. ;)
> >> > > > > >
> >> > > > > > More seriously, a lack of test coverage for existing features
> >> > > > > > shouldn't be used as justification for introducing new
> >> > > > > > features with the same shortcomings. Ultimately, it's the end
> >> > > > > > user who will feel the pain, so shouldn't we do everything we
> >> > > > > > can to mitigate that?
> >> > > > > >
> >> > > > > > -Dima
> >> > > > > >
> >> > > > > > On Wed, Sep 7, 2016 at 8:46 AM, Vladimir Rodionov
> >> > > > > > <vladrodionov@gmail.com> wrote:
> >> > > > > >
> >> > > > > > > Sean,
> >> > > > > > >
> >> > > > > > > * have docs
> >> > > > > > >
> >> > > > > > > Agreed. We have a doc, and backup is the most documented
> >> > > > > > > feature :). We will release it to Apache shortly.
> >> > > > > > >
> >> > > > > > > * have sunny-day correctness tests
> >> > > > > > >
> >> > > > > > > The feature has close to 60 test cases, which run for
> >> > > > > > > approximately 30 minutes. We can add more, if the community
> >> > > > > > > does not mind :)
> >> > > > > > >
> >> > > > > > > * have correctness-in-face-of-failure tests
> >> > > > > > >
> >> > > > > > > Any examples of these tests in existing features? This is
> >> > > > > > > in the works; we have a clear understanding of what should
> >> > > > > > > be done by the time of the 2.0 release. That is a very
> >> > > > > > > near-term goal for us: to verify IT monkey for existing
> >> > > > > > > code.
> >> > > > > > >
> >> > > > > > > * don't rely on things outside of HBase for normal operation
> >> > > > > > > (okay for advanced operation)
> >> > > > > > >
> >> > > > > > > We do not.
> >> > > > > > >
> >> > > > > > > An enormous amount of time has already been spent on
> >> > > > > > > developing and testing the feature; it has passed our
> >> > > > > > > internal tests and many rounds of code review by HBase
> >> > > > > > > committers. We do not mind if someone from the HBase
> >> > > > > > > community (outside of HW) reviews the code, but waiting for
> >> > > > > > > a volunteer will probably take forever: the feature is
> >> > > > > > > quite large (1MB+ cumulative patch).
> >> > > > > > >
> >> > > > > > > The 2.0 branch is full of half-baked features, most of them
> >> > > > > > > in active development, so I am not following you here,
> >> > > > > > > Sean. Why is HBASE-7912 not yet good enough to be
> >> > > > > > > integrated into the 2.0 branch?
> >> > > > > > >
> >> > > > > > > -Vlad
> >> > > > > > >
> >> > > > > > >
> >> > > > > > >
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Wed, Sep 7, 2016 at 8:23 AM, Sean Busbey
> >> > > > > > > <busbey@apache.org> wrote:
> >> > > > > > >
> >> > > > > > > > On Tue, Sep 6, 2016 at 10:36 PM, Josh Elser
> >> > > > > > > > <josh.elser@gmail.com> wrote:
> >> > > > > > > > > So, the answer to Sean's original question is "as robust
> >> > > > > > > > > as snapshots presently are"? (independence of
> >> > > > > > > > > backup/restore failure tolerance from snapshot failure
> >> > > > > > > > > tolerance)
> >> > > > > > > > >
> >> > > > > > > > > Is this just a question WRT context of the change, or is
> >> > > > > > > > > it means for a veto from you, Sean? Just trying to make
> >> > > > > > > > > sure I'm following along adequately.
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > > > I'd say ATM I'm -0, bordering on -1 but not for reasons I
> >> > > > > > > > can articulate well.
> >> > > > > > > >
> >> > > > > > > > Here's an attempt.
> >> > > > > > > >
> >> > > > > > > > We've been trying to move, as a community, towards
> >> > > > > > > > minimizing risk to downstream folks by getting "complete
> >> > > > > > > > enough for use" gates in place before we introduce new
> >> > > > > > > > features. This was spurred by some features getting in
> >> > > > > > > > half-baked and never making it to "can really use" status
> >> > > > > > > > (I'm thinking of distributed log replay and the zk-less
> >> > > > > > > > assignment stuff, I don't recall if there was more).
> >> > > > > > > >
> >> > > > > > > > The gates, generally, included things like:
> >> > > > > > > >
> >> > > > > > > > * have docs
> >> > > > > > > > * have sunny-day correctness tests
> >> > > > > > > > * have correctness-in-face-of-failure tests
> >> > > > > > > > * don't rely on things outside of HBase for normal
> >> > > > > > > > operation (okay for advanced operation)
> >> > > > > > > >
> >> > > > > > > > As an example, we kept the MOB work off in a branch and
> >> > > > > > > > out of master until it could pass these criteria. The big
> >> > > > > > > > exemption we've had to this was the hbase-spark
> >> > > > > > > > integration, where we all agreed it could land in master
> >> > > > > > > > because it was very well isolated (the slide away from
> >> > > > > > > > including docs as a first-class part of building up that
> >> > > > > > > > integration has led me to doubt the wisdom of this
> >> > > > > > > > decision).
> >> > > > > > > >
> >> > > > > > > > We've also been treating inclusion in "probably will be
> >> > > > > > > > released to downstream" branches as a higher bar,
> >> > > > > > > > requiring
> >> > > > > > > >
> >> > > > > > > > * don't moderately impact performance when the feature
> >> > > > > > > > isn't in use
> >> > > > > > > > * don't severely impact performance when the feature is in
> >> > > > > > > > use
> >> > > > > > > > * either default-to-on or show enough demand to believe a
> >> > > > > > > > non-trivial number of folks will turn the feature on
> >> > > > > > > >
> >> > > > > > > > The above has kept MOB and hbase-spark integration out of
> >> > > > > > > > branch-1, presumably while they've "gotten more stable" in
> >> > > > > > > > master from the odd vendor inclusion.
> >> > > > > > > >
> >> > > > > > > > Are we going to have a 2.0 release before the end of the
> >> > > > > > > > year? We're coming up on 1.5 years since the release of
> >> > > > > > > > version 1.0; seems like it's about time, though I haven't
> >> > > > > > > > seen any concrete plans this year. Presuming we are going
> >> > > > > > > > to have one by the end of the year, it seems a bit close
> >> > > > > > > > to still be adding in "features that need maturing" on the
> >> > > > > > > > branch.
> >> > > > > > > >
> >> > > > > > > > The lack of a concrete plan for 2.0 keeps me from
> >> > > > > > > > considering these things blockers at the moment. But I
> >> > > > > > > > know first hand how much trouble folks have had with other
> >> > > > > > > > features that have gone into downstream facing releases
> >> > > > > > > > without robustness checks (i.e. replication), and I'm
> >> > > > > > > > concerned about what we're setting up if 2.0 goes out with
> >> > > > > > > > this feature in its current state.
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > --
> >> > > > > Best regards,
> >> > > > >
> >> > > > >    - Andy
> >> > > > >
> >> > > > > Problems worthy of attack prove their worth by hitting back.
> >> > > > > - Piet Hein (via Tom White)
> >> > > > >
> >> > > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Best regards,
> >> > >
> >> > >    - Andy
> >> > >
> >> > > Problems worthy of attack prove their worth by hitting back.
> >> > > - Piet Hein (via Tom White)
> >> > >
> >> >
> >>
> >
> >
>
