hbase-dev mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912
Date Mon, 10 Oct 2016 20:44:42 GMT
>> mapreduce dependency has been moved to client side - no mapreduce job

1. We have no code in the client module anymore, due to the dependency on
internal server APIs (HFile and WAL access).
2. Backup/restore are client-driven operations, but all the code resides
in the server module.
3. No MR in Master, no procedure-driven execution.
4. Good old MR from the command line.
5. Security was simplified: now only the super-user is allowed to run
backups/restores.
6. The HBase Backup API is gone due to 1. There is now only command-line
access to the backup tools.

These consequences of the refactoring have been discussed in HBASE-16727.

-Vlad




On Mon, Oct 10, 2016 at 1:31 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Reviving this thread.
>
> The following has taken place:
>
> 1. mapreduce dependency has been moved to client side - no mapreduce job
>    launched from master or region server.
> 2. document patch (HBASE-16574) has been integrated.
> 3. Updated mega patch has been attached to HBASE-14123: this covers the
>    refactor in #1 above and the protobuf 3 merge.
>
> If the community has more feedback on the merge proposal, I would love to
> hear it.
>
> Thanks
>
> On Thu, Sep 22, 2016 at 10:31 AM, Sean Busbey <busbey@cloudera.com> wrote:
>
> > I'd like to see the docs proposed on HBASE-16574 integrated into our
> > project's documentation prior to merge.
> >
> > On Thu, Sep 22, 2016 at 9:02 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > This feature can be marked experimental due to some limitations such as
> > > security.
> > >
> > > Your previous round of comments has been addressed.
> > > The command line tool has gone through:
> > >
> > > HBASE-16620 Fix backup command-line tool usability issues
> > > HBASE-16655 hbase backup describe with incorrect backup id results in NPE
> > >
> > > The updated doc has been attached to HBASE-16574.
> > >
> > > Cheers
> > >
> > > On Thu, Sep 22, 2016 at 8:53 AM, Stack <stack@duboce.net> wrote:
> > >
> > >> On Wed, Sep 21, 2016 at 7:43 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >>
> > >> > Are there more (review) comments ?
> > >> >
> > >> >
> > >> Are outstanding comments addressed?
> > >>
> > >> I don't see answer to my 'is this experimental/will it be marked
> > >> experimental' question.
> > >>
> > >> I ran into some issues trying to use the feature and suggested that a
> > >> feature like this needs polish else it'll just rot, unused. Has polish
> > >> been applied? All ready for another 'user' test? Suggest that you update
> > >> here going forward for the benefit of those trying to follow along and
> > >> who are not watching JIRA changes fly by.
> > >>
> > >> It looks like the doc got a revision -- I have to check -- to take on the
> > >> suggestion made above but again, I suggest that this thread gets updated.
> > >>
> > >> Thanks,
> > >> St.Ack
> > >>
> > >>
> > >>
> > >> > Thanks
> > >> >
> > >> > On Tue, Sep 20, 2016 at 10:02 AM, Devaraj Das <ddas@hortonworks.com>
> > >> > wrote:
> > >> >
> > >> > > Just reviving this thread. Thanks Sean, Stack, Dima, and others for the
> > >> > > thorough reviews and testing. Thanks Ted and Vlad for taking care of the
> > >> > > feedback. Are we all good to do the merge now? Rather do sooner than
> > >> > > later.
> > >> > > ________________________________________
> > >> > > From: saint.ack@gmail.com <saint.ack@gmail.com> on behalf of Stack <stack@duboce.net>
> > >> > > Sent: Monday, September 12, 2016 1:18 PM
> > >> > > To: HBase Dev List
> > >> > > Subject: Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912
> > >> > >
> > >> > > On Mon, Sep 12, 2016 at 12:19 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >> > >
> > >> > > > Mega patch (rev 18) is on HBASE-14123.
> > >> > > >
> > >> > > > Please comment on HBASE-14123 on how you want to review.
> > >> > > >
> > >> > >
> > >> > >
> > >> > > Yeah. That was my lost tab. Last rb was 6 months ago. Suggest updating it.
> > >> > > RB is pretty good for review. Patch is only 1.5M so should be fine.
> > >> > >
> > >> > > St.Ack
> > >> > >
> > >> > >
> > >> > > >
> > >> > > > Thanks
> > >> > > >
> > >> > > > On Mon, Sep 12, 2016 at 12:15 PM, Stack <stack@duboce.net> wrote:
> > >> > > >
> > >> > > > > On review of the 'patch', do I just compare the branch to master or is
> > >> > > > > there a megapatch posted somewhere (I think I saw one but it seemed stale
> > >> > > > > and then I 'lost' the tab)? Sorry for the dumb question.
> > >> > > > > St.Ack
> > >> > > > >
> > >> > > > > On Mon, Sep 12, 2016 at 12:01 PM, Stack <stack@duboce.net> wrote:
> > >> > > > >
> > >> > > > > > Late to the game. A few comments after rereading this thread as a
> > >> > > > > > 'user'.
> > >> > > > > >
> > >> > > > > > + Before merge, a user-facing feature like this should work (If this is
> > >> > > > > > "higher-bar for new features", bring it on -- smile).
> > >> > > > > > + As a user, I tried the branch with tools after reviewing the
> > >> > > > > > just-posted doc. I had an 'interesting' experience (left comments up on
> > >> > > > > > issue). I think the tooling/doc. is important to get right. If it breaks
> > >> > > > > > easily or is inconsistent (or lacks 'polish'), operators will judge the
> > >> > > > > > whole backup/restore tooling chain as not trustworthy and abandon it.
> > >> > > > > > Let's not have this happen to this feature.
> > >> > > > > > + Matteo's suggestion (with a helpful starter list) that there needs to
> > >> > > > > > be explicit qualification on what is actually being delivered --
> > >> > > > > > including a listing of limitations (some look serious such as data bleed
> > >> > > > > > from other regions in WALs, but maybe I don't care for my use case...) --
> > >> > > > > > needs to accompany the merge. Let's fold them into the user doc. in the
> > >> > > > > > technical overview area as suggested so user expectations are properly
> > >> > > > > > managed (otherwise, they expect the world and will just give up when we
> > >> > > > > > fall short). Vladimir did a list of what is in each of the phases above
> > >> > > > > > which would serve as a good start.
> > >> > > > > > + Is this feature 'experimental' (Matteo asks above)? I'd prefer it is
> > >> > > > > > not. If it is, it should be labelled all over that it is so. I see the
> > >> > > > > > current state called out as a '... technical preview feature'. Does this
> > >> > > > > > mean not-for-users?
> > >> > > > > >
> > >> > > > > > St.Ack
> > >> > > > > >
> > >> > > > > > On Mon, Sep 12, 2016 at 8:03 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >> > > > > >
> > >> > > > > >> Sean:
> > >> > > > > >> Do you have more comments ?
> > >> > > > > >>
> > >> > > > > >> Cheers
> > >> > > > > >>
> > >> > > > > >> On Fri, Sep 9, 2016 at 1:42 PM, Vladimir Rodionov <vladrodionov@gmail.com>
> > >> > > > > >> wrote:
> > >> > > > > >>
> > >> > > > > >> > Sean,
> > >> > > > > >> >
> > >> > > > > >> > Backup/Restore can fail for various reasons: network outage (cluster
> > >> > > > > >> > wide), various time-outs in the HBase and HDFS layers, M/R failure due
> > >> > > > > >> > to "HDFS exceeded quota", user error (manual deletion of data) and so
> > >> > > > > >> > on. It is impossible to enumerate all possible types of failures in a
> > >> > > > > >> > distributed system - that is not our goal/task.
> > >> > > > > >> >
> > >> > > > > >> > We focus completely on backup system table consistency in the presence
> > >> > > > > >> > of any type of failure. That is what I call "tolerance to failures".
> > >> > > > > >> >
> > >> > > > > >> > On a failure:
> > >> > > > > >> >
> > >> > > > > >> > BACKUP. All backup system information (prior to backup) will be
> > >> > > > > >> > restored and all temporary data in HDFS related to the failed session
> > >> > > > > >> > will be deleted.
> > >> > > > > >> > RESTORE. We do not care about system data, because restore does not
> > >> > > > > >> > change it. Temporary data in HDFS will be cleaned up and the table will
> > >> > > > > >> > be back in the state it was in before the operation started.
> > >> > > > > >> >
> > >> > > > > >> > This is what the user should expect in case of a failure.
> > >> > > > > >> >
> > >> > > > > >> > -Vlad
> > >> > > > > >> >
> > >> > > > > >> > On Fri, Sep 9, 2016 at 12:56 PM, Sean Busbey <busbey@apache.org> wrote:
> > >> > > > > >> >
> > >> > > > > >> > > Failing in a consistent way, with docs that explain the various
> > >> > > > > >> > > expected failures would be sufficient.
> > >> > > > > >> > >
> > >> > > > > >> > > On Fri, Sep 9, 2016 at 12:16 PM, Vladimir Rodionov
> > >> > > > > >> > > <vladrodionov@gmail.com> wrote:
> > >> > > > > >> > > > Do not worry Sean, doc is coming today as a preview and our writer
> > >> > > > > >> > > > Frank will be working on putting it into the Apache repo. Timeline
> > >> > > > > >> > > > depends on Frank's schedule but I hope we will get it sooner rather
> > >> > > > > >> > > > than later.
> > >> > > > > >> > > >
> > >> > > > > >> > > > As for failure testing, we are focusing only on a consistent state of
> > >> > > > > >> > > > backup system data in the presence of any type of failure. We are not
> > >> > > > > >> > > > going to implement anything more "fancy" than that. We allow both
> > >> > > > > >> > > > backup and restore to fail. What we do not allow is to have system
> > >> > > > > >> > > > data corrupted. Will it suffice for you? Do you have any other
> > >> > > > > >> > > > concerns you want us to address?
> > >> > > > > >> > > >
> > >> > > > > >> > > > -Vlad
> > >> > > > > >> > > >
> > >> > > > > >> > > >
> > >> > > > > >> > > > On Fri, Sep 9, 2016 at 10:56 AM, Sean Busbey <busbey@apache.org> wrote:
> > >> > > > > >> > > >
> > >> > > > > >> > > >> "docs will come to Apache soon" does not address my concern around
> > >> > > > > >> > > >> docs at all, unless said docs have already made it into the project
> > >> > > > > >> > > >> repo. I don't want third party resources for using a major and
> > >> > > > > >> > > >> important feature of the project, I want us to provide end users with
> > >> > > > > >> > > >> what they need to get the job done.
> > >> > > > > >> > > >>
> > >> > > > > >> > > >> I see some calls for patience on the failure testing, but the appeal
> > >> > > > > >> > > >> to us having done a bad job of requiring proper tests of previous
> > >> > > > > >> > > >> features just makes me more concerned about not getting them here. I
> > >> > > > > >> > > >> don't want to set yet another bad example that will then be pointed to
> > >> > > > > >> > > >> in the future.
> > >> > > > > >> > > >>
> > >> > > > > >> > > >> On Sep 8, 2016 10:50, "Ted Yu" <yuzhihong@gmail.com> wrote:
> > >> > > > > >> > > >>
> > >> > > > > >> > > >> > Is there any concern which is not addressed ?
> > >> > > > > >> > > >> >
> > >> > > > > >> > > >> > Do we need another Vote thread ?
> > >> > > > > >> > > >> >
> > >> > > > > >> > > >> > Thanks
> > >> > > > > >> > > >> >
> > >> > > > > >> > > >> > On Thu, Sep 8, 2016 at 9:21 AM, Andrew Purtell <apurtell@apache.org>
> > >> > > > > >> > > >> > wrote:
> > >> > > > > >> > > >> >
> > >> > > > > >> > > >> > > Vlad,
> > >> > > > > >> > > >> > >
> > >> > > > > >> > > >> > > I apologize for using the term 'half-baked' in a way that could seem
> > >> > > > > >> > > >> > > a description of HBASE-7912. I meant that as a general hypothetical.
> > >> > > > > >> > > >> > >
> > >> > > > > >> > > >> > > On Wed, Sep 7, 2016 at 9:36 AM, Vladimir Rodionov <vladrodionov@gmail.com>
> > >> > > > > >> > > >> > > wrote:
> > >> > > > > >> > > >> > >
> > >> > > > > >> > > >> > > > >> I'm not sure that "There is already lots of half-baked code in the
> > >> > > > > >> > > >> > > > branch,
> > >> > > > > >> > > >> > > > so what's the harm in adding more?"
> > >> > > > > >> > > >> > > >
> > >> > > > > >> > > >> > > > I meant - not production-ready yet. This is the 2.0 development
> > >> > > > > >> > > >> > > > branch and, hence, many features are in the works, not being tested
> > >> > > > > >> > > >> > > > well etc. I do not consider backup a half-baked feature - it has
> > >> > > > > >> > > >> > > > passed our internal QA and has very good doc, which we will provide
> > >> > > > > >> > > >> > > > to Apache shortly.
> > >> > > > > >> > > >> > > >
> > >> > > > > >> > > >> > > > -Vlad
> > >> > > > > >> > > >> > > >
> > >> > > > > >> > > >> > > > On Wed, Sep 7, 2016 at 9:13 AM, Andrew Purtell <apurtell@apache.org>
> > >> > > > > >> > > >> > > > wrote:
> > >> > > > > >> > > >> > > >
> > >> > > > > >> > > >> > > > > We shouldn't admit half baked changes that won't be finished.
> > >> > > > > >> > > >> > > > > However in this case the crew working on this feature are long
> > >> > > > > >> > > >> > > > > timers and less likely than just about anyone to leave something in
> > >> > > > > >> > > >> > > > > a half baked state. Of course there is no guarantee how anything
> > >> > > > > >> > > >> > > > > will turn out, but I am willing to take a little on faith if they
> > >> > > > > >> > > >> > > > > feel their best path forward now is to merge to trunk. I only wish
> > >> > > > > >> > > >> > > > > I had bandwidth to have done some real kicking of the tires by now.
> > >> > > > > >> > > >> > > > > Maybe this week.
> > >> > > > > >> > > >> > > > >
> > >> > > > > >> > > >> > > > > (Yes, I'm using some of that time for this email :-) but I type
> > >> > > > > >> > > >> > > > > fast.)
> > >> > > > > >> > > >> > > > >
> > >> > > > > >> > > >> > > > > That said, I would like to agitate for making 2.0 more real and
> > >> > > > > >> > > >> > > > > spend some time on it now that I'm winding down with 0.98. I think
> > >> > > > > >> > > >> > > > > that means branching for 2.0 real soon now and even evicting things
> > >> > > > > >> > > >> > > > > from 2.0 branch that aren't finished or stable, leaving them only
> > >> > > > > >> > > >> > > > > once again in the master branch. Or, maybe just evicting them.
> > >> > > > > >> > > >> > > > > Let's take it case by case.
> > >> > > > > >> > > >> > > > >
> > >> > > > > >> > > >> > > > > I think this feature can come in relatively safely. As added
> > >> > > > > >> > > >> > > > > insurance, let's admit the possibility it could be reverted on the
> > >> > > > > >> > > >> > > > > 2.0 branch if folks working on stabilizing 2.0 decide to evict it
> > >> > > > > >> > > >> > > > > because it is unfinished or unstable, because that certainly can
> > >> > > > > >> > > >> > > > > happen. I would expect if talk like that starts, we'd get help
> > >> > > > > >> > > >> > > > > finishing or stabilizing what's under discussion for revert. Or,
> > >> > > > > >> > > >> > > > > we'd have a revert. Either way the outcome is acceptable.
> > >> > > > > >> > > >> > > > >
> > >> > > > > >> > > >> > > > >
> > >> > > > > >> > > >> > > > > On Wed, Sep 7, 2016 at 8:56 AM, Dima Spivak <dimaspivak@apache.org>
> > >> > > > > >> > > >> > > > > wrote:
> > >> > > > > >> > > >> > > > >
> > >> > > > > >> > > >> > > > > > I'm not sure that "There is already lots of half-baked code in the
> > >> > > > > >> > > >> > > > > > branch, so what's the harm in adding more?" is a good code commit
> > >> > > > > >> > > >> > > > > > philosophy for a fault-tolerant distributed data store. ;)
> > >> > > > > >> > > >> > > > > >
> > >> > > > > >> > > >> > > > > > More seriously, a lack of test coverage for existing features
> > >> > > > > >> > > >> > > > > > shouldn't be used as justification for introducing new features
> > >> > > > > >> > > >> > > > > > with the same shortcomings. Ultimately, it's the end user who will
> > >> > > > > >> > > >> > > > > > feel the pain, so shouldn't we do everything we can to mitigate
> > >> > > > > >> > > >> > > > > > that?
> > >> > > > > >> > > >> > > > > >
> > >> > > > > >> > > >> > > > > > -Dima
> > >> > > > > >> > > >> > > > > >
> > >> > > > > >> > > >> > > > > > On Wed, Sep 7, 2016 at 8:46 AM, Vladimir Rodionov <vladrodionov@gmail.com>
> > >> > > > > >> > > >> > > > > > wrote:
> > >> > > > > >> > > >> > > > > >
> > >> > > > > >> > > >> > > > > > > Sean,
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > > * have docs
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > > Agree. We have a doc and backup is the most documented feature :),
> > >> > > > > >> > > >> > > > > > > we will release it shortly to Apache.
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > > * have sunny-day correctness tests
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > > Feature has close to 60 test cases, which run for approx 30 min.
> > >> > > > > >> > > >> > > > > > > We can add more, if the community does not mind :)
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > > * have correctness-in-face-of-failure tests
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > > Any examples of these tests in existing features? In the works; we
> > >> > > > > >> > > >> > > > > > > have a clear understanding of what should be done by the time of
> > >> > > > > >> > > >> > > > > > > the 2.0 release. That is a very close goal for us: to verify IT
> > >> > > > > >> > > >> > > > > > > monkey for existing code.
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > > * don't rely on things outside of HBase for normal operation (okay
> > >> > > > > >> > > >> > > > > > > for advanced operation)
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > > We do not.
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > > Enormous time has been spent already on the development and
> > >> > > > > >> > > >> > > > > > > testing of the feature; it has passed our internal tests and many
> > >> > > > > >> > > >> > > > > > > rounds of code reviews by HBase committers. We do not mind if
> > >> > > > > >> > > >> > > > > > > someone from the HBase community (outside of HW) reviews the code,
> > >> > > > > >> > > >> > > > > > > but it will probably take forever to wait for a volunteer; the
> > >> > > > > >> > > >> > > > > > > feature is quite large (1MB+ cumulative patch).
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > > 2.0 branch is full of half baked features, most of them are in
> > >> > > > > >> > > >> > > > > > > active development, therefore I am not following you here, Sean.
> > >> > > > > >> > > >> > > > > > > Why is HBASE-7912 not good enough yet to be integrated into 2.0
> > >> > > > > >> > > >> > > > > > > branch?
> > >> > > > > >> > > >> > > > > > > -Vlad
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > > On Wed, Sep 7, 2016 at 8:23 AM, Sean Busbey <busbey@apache.org> wrote:
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > > > > On Tue, Sep 6, 2016 at 10:36 PM, Josh Elser <josh.elser@gmail.com>
> > >> > > > > >> > > >> > > > > > > > wrote:
> > >> > > > > >> > > >> > > > > > > > > So, the answer to Sean's original question is "as robust as
> > >> > > > > >> > > >> > > > > > > > > snapshots presently are"? (independence of backup/restore
> > >> > > > > >> > > >> > > > > > > > > failure tolerance from snapshot failure tolerance)
> > >> > > > > >> > > >> > > > > > > > >
> > >> > > > > >> > > >> > > > > > > > > Is this just a question WRT context of the change, or is it
> > >> > > > > >> > > >> > > > > > > > > means for a veto from you, Sean? Just trying to make sure I'm
> > >> > > > > >> > > >> > > > > > > > > following along adequately.
> > >> > > > > >> > > >> > > > > > > > >
> > >> > > > > >> > > >> > > > > > > >
> > >> > > > > >> > > >> > > > > > > > I'd say ATM I'm -0, bordering on -1 but not for reasons I can
> > >> > > > > >> > > >> > > > > > > > articulate well.
> > >> > > > > >> > > >> > > > > > > >
> > >> > > > > >> > > >> > > > > > > > Here's an attempt.
> > >> > > > > >> > > >> > > > > > > >
> > >> > > > > >> > > >> > > > > > > > We've been trying to move, as a community, towards minimizing
> > >> > > > > >> > > >> > > > > > > > risk to downstream folks by getting "complete enough for use"
> > >> > > > > >> > > >> > > > > > > > gates in place before we introduce new features. This was spurred
> > >> > > > > >> > > >> > > > > > > > by some features getting in half-baked and never making it to
> > >> > > > > >> > > >> > > > > > > > "can really use" status (I'm thinking of distributed log replay
> > >> > > > > >> > > >> > > > > > > > and the zk-less assignment stuff, I don't recall if there was
> > >> > > > > >> > > >> > > > > > > > more).
> > >> > > > > >> > > >> > > > > > > >
> > >> > > > > >> > > >> > > > > > > > The gates, generally, included things like:
> > >> > > > > >> > > >> > > > > > > >
> > >> > > > > >> > > >> > > > > > > > * have docs
> > >> > > > > >> > > >> > > > > > > > * have sunny-day correctness tests
> > >> > > > > >> > > >> > > > > > > > * have correctness-in-face-of-failure tests
> > >> > > > > >> > > >> > > > > > > > * don't rely on things outside of HBase for normal operation
> > >> > > > > >> > > >> > > > > > > > (okay for advanced operation)
> > >> > > > > >> > > >> > > > > > > >
> > >> > > > > >> > > >> > > > > > > > As an example, we kept the MOB work off in a branch and out of
> > >> > > > > >> > > >> > > > > > > > master until it could pass these criteria. The big exemption
> > >> > > > > >> > > >> > > > > > > > we've had to this was the hbase-spark integration, where we all
> > >> > > > > >> > > >> > > > > > > > agreed it could land in master because it was very well isolated
> > >> > > > > >> > > >> > > > > > > > (the slide away from including docs as a first-class part of
> > >> > > > > >> > > >> > > > > > > > building up that integration has led me to doubt the wisdom of
> > >> > > > > >> > > >> > > > > > > > this decision).
> > >> > > > > >> > > >> > > > > > > >
> > >> > > > > >> > > >> > > > > > > > We've also been treating inclusion in "probably will be released
> > >> > > > > >> > > >> > > > > > > > to downstream" branches as a higher bar, requiring
> > >> > > > > >> > > >> > > > > > > >
> > >> > > > > >> > > >> > > > > > > > * don't moderately impact performance when the feature isn't in
> > >> > > > > >> > > >> > > > > > > > use
> > >> > > > > >> > > >> > > > > > > > * don't severely impact performance when the feature is in use
> > >> > > > > >> > > >> > > > > > > > * either default-to-on or show enough demand to believe a
> > >> > > > > >> > > >> > > > > > > > non-trivial number of folks will turn the feature on
> > >> > > > > >> > > >> > > > > > > >
> > >> > > > > >> > > >> > > > > > > > The above has kept MOB and hbase-spark integration out of
> > >> > > > > >> > > >> > > > > > > > branch-1, presumably while they've "gotten more stable" in master
> > >> > > > > >> > > >> > > > > > > > from the odd vendor inclusion.
> > >> > > > > >> > > >> > > > > > > >
> > >> > > > > >> > > >> > > > > > > > Are we going to have a 2.0 release before the end of the year?
> > >> > > > > >> > > >> > > > > > > > We're coming up on 1.5 years since the release of version 1.0;
> > >> > > > > >> > > >> > > > > > > > seems like it's about time, though I haven't seen any concrete
> > >> > > > > >> > > >> > > > > > > > plans this year. Presuming we are going to have one by the end of
> > >> > > > > >> > > >> > > > > > > > the year, it seems a bit close to still be adding in "features
> > >> > > > > >> > > >> > > > > > > > that need maturing" on the branch.
> > >> > > > > >> > > >> > > > > > > >
> > >> > > > > >> > > >> > > > > > > > The lack of a concrete plan for 2.0 keeps me from considering
> > >> > > > > >> > > >> > > > > > > > these things blocker at the moment. But I know first hand how
> > >> > > > > >> > > >> > > > > > > > much trouble folks have had with other features that have gone
> > >> > > > > >> > > >> > > > > > > > into downstream facing releases without robustness checks (i.e.
> > >> > > > > >> > > >> > > > > > > > replication), and I'm concerned about what we're setting up if
> > >> > > > > >> > > >> > > > > > > > 2.0 goes out with this feature in its current state.
> > >> > > > > >> > > >> > > > > > > >
> > >> > > > > >> > > >> > > > > > >
> > >> > > > > >> > > >> > > > > >
> > >> > > > > >> > > >> > > > >
> > >> > > > > >> > > >> > > > >
> > >> > > > > >> > > >> > > > >
> > >> > > > > >> > > >> > > > > --
> > >> > > > > >> > > >> > > > > Best regards,
> > >> > > > > >> > > >> > > > >
> > >> > > > > >> > > >> > > > >    - Andy
> > >> > > > > >> > > >> > > > >
> > >> > > > > >> > > >> > > > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > >> > > > > >> > > >> > > > > (via Tom White)
> > >> > > > > >> > > >> > > > >
> > >> > > > > >> > > >> > > >
> > >> > > > > >> > > >> > >
> > >> > > > > >> > > >> > >
> > >> > > > > >> > > >> > >
> > >> > > > > >> > > >> > > --
> > >> > > > > >> > > >> > > Best regards,
> > >> > > > > >> > > >> > >
> > >> > > > > >> > > >> > >    - Andy
> > >> > > > > >> > > >> > >
> > >> > > > > >> > > >> > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > >> > > > > >> > > >> > > (via Tom White)
> > >> > > > > >> > > >> > >
> > >> > > > > >> > > >> >
> > >> > > > > >> > > >>
> > >> > > > > >> > >
> > >> > > > > >> >
> > >> > > > > >>
> > >> > > > > >
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> >
> >
> > --
> > busbey
> >
>
