hbase-dev mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912
Date Fri, 09 Sep 2016 19:16:09 GMT
Do not worry, Sean, the doc is coming today as a preview, and our writer Frank
will be working on putting it into the Apache repo. The timeline depends on
Frank's schedule, but I hope we will get it sooner rather than later.

As for failure testing, we are focusing only on keeping the backup system
data in a consistent state in the presence of any type of failure. We are not
going to implement anything more "fancy" than that. We allow both backup and
restore to fail. What we do not allow is for system data to be corrupted.
Will that suffice for you? Do you have any other concerns you want us to
address?

-Vlad


On Fri, Sep 9, 2016 at 10:56 AM, Sean Busbey <busbey@apache.org> wrote:

> "docs will come to Apache soon" does not address my concern around docs at
> all, unless said docs have already made it into the project repo. I don't
> want third party resources for using a major and important feature of the
> project, I want us to provide end users with what they need to get the job
> done.
>
> I see some calls for patience on the failure testing, but the appeal to us
> having done a bad job of requiring proper tests of previous features just
> makes me more concerned about not getting them here. I don't want to set
> yet another bad example that will then be pointed to in the future.
>
> On Sep 8, 2016 10:50, "Ted Yu" <yuzhihong@gmail.com> wrote:
>
> > Is there any concern which is not addressed?
> >
> > Do we need another Vote thread?
> >
> > Thanks
> >
> > On Thu, Sep 8, 2016 at 9:21 AM, Andrew Purtell <apurtell@apache.org>
> > wrote:
> >
> > > Vlad,
> > >
> > > I apologize for using the term 'half-baked' in a way that could seem a
> > > description of HBASE-7912. I meant that as a general hypothetical.
> > >
> > > > On Wed, Sep 7, 2016 at 9:36 AM, Vladimir Rodionov <vladrodionov@gmail.com>
> > > > wrote:
> > >
> > > > > >> I'm not sure that "There is already lots of half-baked code in the
> > > > > >> branch, so what's the harm in adding more?"
> > > > >
> > > > > I meant not production-ready yet. This is the 2.0 development branch
> > > > > and hence many features are in the works, not well tested, etc. I do
> > > > > not consider backup a half-baked feature: it has passed our internal
> > > > > QA and has very good docs, which we will provide to Apache shortly.
> > > >
> > > > -Vlad
> > > >
> > > > On Wed, Sep 7, 2016 at 9:13 AM, Andrew Purtell <apurtell@apache.org>
> > > > wrote:
> > > >
> > > > > > We shouldn't admit half baked changes that won't be finished.
> > > > > > However in this case the crew working on this feature are long
> > > > > > timers and less likely than just about anyone to leave something
> > > > > > in a half baked state. Of course there is no guarantee how
> > > > > > anything will turn out, but I am willing to take a little on
> > > > > > faith if they feel their best path forward now is to merge to
> > > > > > trunk. I only wish I had bandwidth to have done some real kicking
> > > > > > of the tires by now. Maybe this week.
> > > > >
> > > > > > (Yes, I'm using some of that time for this email :-) but I type
> > > > > > fast.)
> > > > >
> > > > > > That said, I would like to agitate for making 2.0 more real and
> > > > > > spend some time on it now that I'm winding down with 0.98. I think
> > > > > > that means branching for 2.0 real soon now and even evicting things
> > > > > > from 2.0 branch that aren't finished or stable, leaving them only
> > > > > > once again in the master branch. Or, maybe just evicting them.
> > > > > > Let's take it case by case.
> > > > >
> > > > > > I think this feature can come in relatively safely. As added
> > > > > > insurance, let's admit the possibility it could be reverted on the
> > > > > > 2.0 branch if folks working on stabilizing 2.0 decide to evict it
> > > > > > because it is unfinished or unstable, because that certainly can
> > > > > > happen. I would expect if talk like that starts, we'd get help
> > > > > > finishing or stabilizing what's under discussion for revert. Or,
> > > > > > we'd have a revert. Either way the outcome is acceptable.
> > > > >
> > > > >
> > > > > > On Wed, Sep 7, 2016 at 8:56 AM, Dima Spivak <dimaspivak@apache.org>
> > > > > > wrote:
> > > > >
> > > > > > > I'm not sure that "There is already lots of half-baked code in
> > > > > > > the branch, so what's the harm in adding more?" is a good code
> > > > > > > commit philosophy for a fault-tolerant distributed data store. ;)
> > > > > >
> > > > > > > More seriously, a lack of test coverage for existing features
> > > > > > > shouldn't be used as justification for introducing new features
> > > > > > > with the same shortcomings. Ultimately, it's the end user who
> > > > > > > will feel the pain, so shouldn't we do everything we can to
> > > > > > > mitigate that?
> > > > > >
> > > > > > -Dima
> > > > > >
> > > > > > > On Wed, Sep 7, 2016 at 8:46 AM, Vladimir Rodionov
> > > > > > > <vladrodionov@gmail.com> wrote:
> > > > > >
> > > > > > > Sean,
> > > > > > >
> > > > > > > * have docs
> > > > > > >
> > > > > > > > Agree. We have a doc, and backup is the most documented
> > > > > > > > feature :). We will release it to Apache shortly.
> > > > > > >
> > > > > > > * have sunny-day correctness tests
> > > > > > >
> > > > > > > > The feature has close to 60 test cases, which run for approx.
> > > > > > > > 30 min. We can add more, if the community does not mind :)
> > > > > > >
> > > > > > > * have correctness-in-face-of-failure tests
> > > > > > >
> > > > > > > > Any examples of these tests in existing features? This is in
> > > > > > > > the works, and we have a clear understanding of what should be
> > > > > > > > done by the time of the 2.0 release. That is a very close goal
> > > > > > > > for us: to verify IT monkey for existing code.
> > > > > > >
> > > > > > > > * don't rely on things outside of HBase for normal operation
> > > > > > > > (okay for advanced operation)
> > > > > > >
> > > > > > > We do not.
> > > > > > >
> > > > > > > > Enormous time has already been spent on developing and testing
> > > > > > > > the feature. It has passed our internal tests and many rounds
> > > > > > > > of code review by HBase committers. We do not mind if someone
> > > > > > > > from the HBase community (outside of HW) reviews the code, but
> > > > > > > > waiting for a volunteer will probably take forever, as the
> > > > > > > > feature is quite large (1MB+ cumulative patch).
> > > > > > >
> > > > > > > > The 2.0 branch is full of half-baked features, most of them in
> > > > > > > > active development, so I am not following you here, Sean. Why
> > > > > > > > is HBASE-7912 not good enough yet to be integrated into the 2.0
> > > > > > > > branch?
> > > > > > >
> > > > > > > -Vlad
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > On Wed, Sep 7, 2016 at 8:23 AM, Sean Busbey <busbey@apache.org>
> > > > > > > > wrote:
> > > > > > >
> > > > > > > > > On Tue, Sep 6, 2016 at 10:36 PM, Josh Elser
> > > > > > > > > <josh.elser@gmail.com> wrote:
> > > > > > > > > > So, the answer to Sean's original question is "as robust
> > > > > > > > > > as snapshots presently are"? (independence of
> > > > > > > > > > backup/restore failure tolerance from snapshot failure
> > > > > > > > > > tolerance)
> > > > > > > > >
> > > > > > > > > > Is this just a question WRT context of the change, or is
> > > > > > > > > > it means for a veto from you, Sean? Just trying to make
> > > > > > > > > > sure I'm following along adequately.
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > > > I'd say ATM I'm -0, bordering on -1 but not for reasons I can
> > > > > > > > > articulate well.
> > > > > > > >
> > > > > > > > Here's an attempt.
> > > > > > > >
> > > > > > > > > We've been trying to move, as a community, towards minimizing
> > > > > > > > > risk to downstream folks by getting "complete enough for use"
> > > > > > > > > gates in place before we introduce new features. This was
> > > > > > > > > spurred by some features getting in half-baked and never
> > > > > > > > > making it to "can really use" status (I'm thinking of
> > > > > > > > > distributed log replay and the zk-less assignment stuff, I
> > > > > > > > > don't recall if there was more).
> > > > > > > >
> > > > > > > > The gates, generally, included things like:
> > > > > > > >
> > > > > > > > * have docs
> > > > > > > > * have sunny-day correctness tests
> > > > > > > > * have correctness-in-face-of-failure tests
> > > > > > > > > * don't rely on things outside of HBase for normal operation
> > > > > > > > > (okay for advanced operation)
> > > > > > > >
> > > > > > > > > As an example, we kept the MOB work off in a branch and out
> > > > > > > > > of master until it could pass these criteria. The big
> > > > > > > > > exemption we've had to this was the hbase-spark integration,
> > > > > > > > > where we all agreed it could land in master because it was
> > > > > > > > > very well isolated (the slide away from including docs as a
> > > > > > > > > first-class part of building up that integration has led me
> > > > > > > > > to doubt the wisdom of this decision).
> > > > > > > >
> > > > > > > > > We've also been treating inclusion in "probably will be
> > > > > > > > > released to downstream" branches as a higher bar, requiring
> > > > > > > >
> > > > > > > > > * don't moderately impact performance when the feature isn't
> > > > > > > > > in use
> > > > > > > > > * don't severely impact performance when the feature is in
> > > > > > > > > use
> > > > > > > > > * either default-to-on or show enough demand to believe a
> > > > > > > > > non-trivial number of folks will turn the feature on
> > > > > > > >
> > > > > > > > > The above has kept MOB and hbase-spark integration out of
> > > > > > > > > branch-1, presumably while they've "gotten more stable" in
> > > > > > > > > master from the odd vendor inclusion.
> > > > > > > >
> > > > > > > > > Are we going to have a 2.0 release before the end of the
> > > > > > > > > year? We're coming up on 1.5 years since the release of
> > > > > > > > > version 1.0; seems like it's about time, though I haven't
> > > > > > > > > seen any concrete plans this year. Presuming we are going to
> > > > > > > > > have one by the end of the year, it seems a bit close to
> > > > > > > > > still be adding in "features that need maturing" on the
> > > > > > > > > branch.
> > > > > > > >
> > > > > > > > > The lack of a concrete plan for 2.0 keeps me from considering
> > > > > > > > > these things blockers at the moment. But I know first hand
> > > > > > > > > how much trouble folks have had with other features that have
> > > > > > > > > gone into downstream-facing releases without robustness
> > > > > > > > > checks (e.g. replication), and I'm concerned about what we're
> > > > > > > > > setting up if 2.0 goes out with this feature in its current
> > > > > > > > > state.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best regards,
> > > > >
> > > > >    - Andy
> > > > >
> > > > > > Problems worthy of attack prove their worth by hitting back.
> > > > > > - Piet Hein (via Tom White)
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > >
> > >    - Andy
> > >
> > > > Problems worthy of attack prove their worth by hitting back.
> > > > - Piet Hein (via Tom White)
> > >
> >
>
