hbase-dev mailing list archives

From Jonathan Hsieh <...@cloudera.com>
Subject Re: [DISCUSS] More new feature backports to 0.94.
Date Sat, 02 Mar 2013 21:49:32 GMT
I don't think it is a debate about feature vs bug fix -- I've been trying
to make a general case about major feature backports.   I agree that we are
basically on the same page for the general case.    I've been using some of
the current candidate features as examples but I'm really trying to focus
on defining a general "finished" condition early on for big backports
(bullet points that would be highlights of the next release), and to
express the need for higher scrutiny on these commits.  I'll post specific
details for each proposal in the appropriate jira.

In the specific case, I actually do want the table locks, but only after
they are done and have some evidence of stability.   I would much rather
have known bugs with known workarounds instead of unknown issues introduced
by a backported feature, and would like to avoid hackery introduced by
compatibility-bug fire drills.

The point I'm trying to make about 0.95.x is that ideally it is where the
new features get further hardened (as opposed to the stable branch).
Ideally the release manager for that version will start gatekeeping which
new major features/changes make it in there so that we have a chance of
releasing it and a 0.96 sometime soon. :)

Jon.

On Sat, Mar 2, 2013 at 12:46 PM, lars hofhansl <larsh@apache.org> wrote:

> In the end, I think, it boils down to the established process.
> Anybody can open a jira and propose a patch. If it gets +1's from a few
> committers and no -1's we should commit it.
> As I said on HBASE-7965, if we cannot convince Jon and Elliott that this is
> safe to do, we should not do it (either because Enis and I agree, or
> because Jon -1's it). No hard feelings either way, I hope (none from my side
> at least).
>
>
> It seems we're mostly in agreement and just differ a bit in what
> constitutes a feature vs. a bug fix.
>
> -- Lars
>
>
>
> ________________________________
>  From: Jonathan Hsieh <jon@cloudera.com>
> To: dev@hbase.apache.org
> Cc: lars hofhansl <larsh@apache.org>
> Sent: Saturday, March 2, 2013 8:26 AM
> Subject: Re: [DISCUSS] More new feature backports to 0.94.
>
>
> To be clear, a key point is that unit testing is required but not
> sufficient.  I need anecdotes about system testing with at least some
> unexpected fault handling and stress.  If the feature is still actively being
> developed, it should go into a dev branch (GitHub or svn) that eventually
> merges.  Some info about perf would be nice as well if that is affected.
>
> In cases that aren't too burdensome, I would prefer consecutive individual
> commits to a stable branch as opposed to a single mega patch.  This of
> course is a case-by-case decision. (snapshots is about 80 patches.. way too
> burdensome).
>
>
> Jon.
>
>
> On Sat, Mar 2, 2013 at 8:14 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> >bq. I would want to see this feature come in as a big bang -- get it
> >complete enough in trunk before backporting the pieces to a stable branch.
> >
> >I agree with Jon on this point.
> >Porting in one big patch allows us to think through related use cases.
> >Another benefit is that there wouldn't be a glitch in the API in case the
> >first batch of backports went into 0.94.x and the second batch goes into
> >0.94.x+1.
> >Running the feature through test suite in trunk continuously gives us time
> >to discover defects before the backport.
> >
> >Cheers
> >
> >
> >On Sat, Mar 2, 2013 at 7:36 AM, Jonathan Hsieh <jon@cloudera.com> wrote:
> >
> >> In general, I have a preference against backporting features for the
> >> reasons that Enis, Elliott, and Jean-Marc consider valid.  To be clear,
> >> this preference doesn't mean I am -1 to all backports onto the stable
> >> apache branch.  Let's do it case-by-case; my main ask is to make major
> >> backports rare and to make it the norm to require significantly more
> >> evidence of testing than usual.  I will -1 a major backport that lacks
> >> this evidence.  This will come up again in the future.
> >>
> >> With the cases Lars proposed -- I prefer #3 (just say no) but find #1 (be
> >> very careful) acceptable given a higher level of evidence.  #2 (new release
> >> branch) is onerous -- I'd rather we just get preview-release branches out
> >> more frequently to not have to deal with this.  Arguably, the
> >> preview-release branches serve the purpose of getting releases out more
> >> frequently and giving a feature time to harden from a few common points.
> >> My hope is that these preview releases will replace what were the 0.x.0
> >> and 0.x.1 releases from previous versions.
> >>
> >> So what kind of evidence would I like to see? We can use the snapshots
> >> case as an example.
> >>
> >> When backporting snapshots was brought up, I actually preferred that we
> >> not backport that feature.  There was demand, so we agreed that we'd do it
> >> but not backport it until it is "rock solid".  Here's evidence to support
> >> the case that the feature and backport are solid:
> >> * Its code history is publicly documented and has been available since
> >> December.
> >> * Its design documentation has been available for even longer.
> >> * The feature is mostly additive and doesn't affect vital paths.
> >> * It was tested against trunk and then later tested against a 0.94
> >> variant that is closer to the target apache branch.
> >> * The version in the trunk branch has been reviewed by 5 committers.
> >> * Limitations are either documented (please let me know if we should
> >> improve it more) or non-critical.
> >> * Testing and hardening anecdotes have been documented in the original
> >> and backport jira.  There has been some relatively long term testing and
> >> fault injection testing (roughly 4-6 weeks).
> >> * It will be backported in a "big bang" -- all pieces get added or none
> >> will.
> >>
> >> This is a level I consider to be stronger than the normal testing
> >> expected for a patch.  Ideally, something at least this level is what I
> >> would expect for other major backports.  Do we agree on that?
> >>
> >> For the table locks case, some of this may be a misperception of timing
> >> on my part.  I see a notification about this in jira, which makes me
> >> think it is more imminent.  Looking into it, I see that currently the
> >> development and application of the zk table lock feature isn't complete
> >> -- the mechanism is committed but it isn't applied and integrated into
> >> all the operations (split, assign etc. still on the way).  I've asked for
> >> documentation and Enis has graciously added a great design doc that will
> >> help reviewers understand it.  I'd love to be able to spend time system
> >> testing to really beat it up, or at least have anecdotes from folks about
> >> their efforts on the apache version.  Finally, I would want to see this
> >> feature come in as a big bang -- get it complete enough in trunk before
> >> backporting the pieces to a stable branch.
> >>
> >> I haven't invested time into the online merge backport decision but my
> >> instinct there is to not port the feature as well.  It is less risky
> >> since it is an additive feature but has less reward since we already have
> >> a less-friendly-but-comparable mechanism.  Since merge seems similar to
> >> split (which took a while to get right), testing its correctness in
> >> failure cases at the system level would be a prereq.
> >>
> >> Jon.
> >>
> >> On Sat, Mar 2, 2013 at 3:43 AM, Nicolas Liochon <nkeywal@gmail.com> wrote:
> >>
> >> > New feature is a red herring imho: To me the only question is the
> >> > regression risk. And a feature can have a much lower regression risk
> >> > than a bug fix. I guess we've all seen a fix for a non-critical bug take
> >> > down a production system. Being able to backport features is a
> >> > competitive advantage that builds on a good architecture and a good test
> >> > suite. Maintaining a branch adds a cost for everybody: if you have a bug
> >> > to fix in 0.94.6.1, you need to fix it in 0.94.7 as well. So we should do
> >> > it only if we really have to, and plan it accordingly (i.e. we should not
> >> > have to create a 0.94.7.1 a week after the creation of the 0.94.6.1).
> >> >
> >> > In the future, the test suite should also help us to estimate and
> >> > minimize the risk. We're not there yet, but having good test coverage is
> >> > key for version 1 imho.
> >> >
> >> > So that makes me +1 for backport, and 0 for branching (+1 if there is a
> >> > good reason and a plan, but here it's a theoretical discussion, so... ;-) )
> >> >
> >> > Nicolas
> >> >
> >> >
> >> > On Sat, Mar 2, 2013 at 4:44 AM, lars hofhansl <larsh@apache.org> wrote:
> >> >
> >> > > I did mean "stabilizing". What I was trying to point out is that stuff
> >> > > we backport might stabilize HBase.
> >> > >
> >> > >
> >> > >
> >> > > ________________________________
> >> > >  From: Ted Yu <yuzhihong@gmail.com>
> >> > > To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org>
> >> > > Sent: Friday, March 1, 2013 7:30 PM
> >> > > Subject: Re: [DISCUSS] More new feature backports to 0.94.
> >> > >
> >> > > bq. That is only if we do not backport stabilizing "features".
> >> > > Did you mean destabilizing above :-)
> >> > >
> >> > > My preference is option #1. With option #2, the community would be
> >> > > dealing with one more branch, which would increase the amount of work
> >> > > validating each release candidate.
> >> > >
> >> > > To me, the difference between option #2 and the upcoming release
> >> > > candidates of 0.95 would blur.
> >> > >
> >> > > Cheers
> >> > >
> >> > > On Fri, Mar 1, 2013 at 7:24 PM, lars hofhansl <larsh@apache.org> wrote:
> >> > >
> >> > > > That is only if we do not backport stabilizing "features". There is
> >> > > > an "opportunity cost" to be paid if we take too rigorous an approach,
> >> > > > too.
> >> > > >
> >> > > > Take for example table-locks (which prompted this discussion). With
> >> > > > that in place we can do safe online schema changes (that won't fail
> >> > > > and leave the table in an undefined state when a concurrent split
> >> > > > happens); it also allows for online merge.
> >> > > >
> >> > > > Now, is that a destabilizing "feature", or will it make HBase more
> >> > > > stable and hence is an "improvement"? Depends on viewpoint, doesn't it?
> >> > > > -- Lars
> >> > > >
> >> > > >
> >> > > > ________________________________
> >> > > >  From: Jean-Marc Spaggiari <jean-marc@spaggiari.org>
> >> > > > To: dev@hbase.apache.org
> >> > > > Sent: Friday, March 1, 2013 7:12 PM
> >> > > > Subject: Re: [DISCUSS] More new feature backports to 0.94.
> >> > > >
> >> > > > @Lars: No, no concern about anything already backported. Just a
> >> > > > preference for #2 because it seems to make things more stable and
> >> > > > easier to manage. New feature = new release, given that new
> >> > > > sub-releases are for fixes and improvements, not new features. Also,
> >> > > > if we backport a feature into one or many previous releases, we will
> >> > > > also have to backport all the fixes each time there is an issue. So
> >> > > > we will have more maintenance work on previous releases.
> >> > > >
> >> > > > 2013/3/1 Enis Söztutar <enis.soz@gmail.com>:
> >> > > > > I think the current way of risk vs rewards analysis is working
> >> > > > > well. We will just continue doing that on a case by case basis,
> >> > > > > discussing the implications on individual issues.
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > On Fri, Mar 1, 2013 at 6:46 PM, Lars Hofhansl <lhofhansl@yahoo.com> wrote:
> >> > > > >
> >> > > > >> BTW are you concerned about any specific backport we did in the
> >> > > > >> past? So far we have not seen any destabilization in any of the
> >> > > > >> 0.94 releases.
> >> > > > >>
> >> > > > >> Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
> >> > > > >>
> >> > > > >> >Hi Lars, #2, does it mean you will stop back-porting the new
> >> > > > >> >features when it becomes a "long-term" release? If so, I'm for
> >> > > > >> >option #2...
> >> > > > >> >
> >> > > > >> >JM
> >> > > > >> >
> >> > > > >> >In your option
> >> > > > >> >2013/3/1 Enis Söztutar <enis.soz@gmail.com>:
> >> > > > >> >> Thanks Lars, I think it is a good listing of the options we
> >> > > > >> >> have.
> >> > > > >> >>
> >> > > > >> >> I'll be +1 for #1 and #2, with #1 being a preference.
> >> > > > >> >>
> >> > > > >> >> Enis
> >> > > > >> >>
> >> > > > >> >>
> >> > > > >> >> On Fri, Mar 1, 2013 at 6:10 PM, lars hofhansl <larsh@apache.org> wrote:
> >> > > > >> >>
> >> > > > >> >>> So it seems that until we have a stable 0.96 (maybe 0.96.1 or
> >> > > > >> >>> 0.96.2) we have three options:
> >> > > > >> >>> 1. Backport new features to 0.94 as we see fit as long as we do
> >> > > > >> >>> not destabilize 0.94.
> >> > > > >> >>> 2. Declare a certain point release (0.94.6 looks like a good
> >> > > > >> >>> candidate) as a "long term" release, create an 0.94.6 branch (in
> >> > > > >> >>> addition to the usual 0.94.6 tag) and then create 0.94.6.x
> >> > > > >> >>> fix-only releases. I would volunteer to maintain a 0.94.6 branch
> >> > > > >> >>> in addition to the 0.94 branch.
> >> > > > >> >>> 3. Categorically do not backport new features into 0.94 and
> >> > > > >> >>> defer to 0.95.
> >> > > > >> >>>
> >> > > > >> >>> I'd be +1 on option #1 and #2, and -1 on option #3.
> >> > > > >> >>>
> >> > > > >> >>> -- Lars
> >> > > > >> >>>
> >> > > > >> >>>
> >> > > > >> >>>
> >> > > > >> >>> ________________________________
> >> > > > >> >>>  From: Jonathan Hsieh <jon@cloudera.com>
> >> > > > >> >>> To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org>
> >> > > > >> >>> Sent: Friday, March 1, 2013 3:11 PM
> >> > > > >> >>> Subject: Re: [DISCUSS] More new feature backports to 0.94.
> >> > > > >> >>>
> >> > > > >> >>> I think we are basically agreeing -- my primary concern is that
> >> > > > >> >>> bringing new features in vital paths introduces more risk. I'd
> >> > > > >> >>> rather not backport major new features unless we achieve a higher
> >> > > > >> >>> level of assurance through system and basic fault injection
> >> > > > >> >>> testing.
> >> > > > >> >>>
> >> > > > >> >>> For the three current examples -- snapshots, zk table locks,
> >> > > > >> >>> online merge -- I actually would prefer not including any in
> >> > > > >> >>> apache 0.94.  Of the bunch, I feel the table locks are the most
> >> > > > >> >>> risky since they affect vital paths a user must use, whereas
> >> > > > >> >>> snapshots and online merge are features that a user could choose
> >> > > > >> >>> to use but does not necessarily have to use.  I'll voice my
> >> > > > >> >>> concerns, reasons for concerns, and justifications on the
> >> > > > >> >>> individual jiras.
> >> > > > >> >>>
> >> > > > >> >>> I do feel that new features being in a dev/preview release like
> >> > > > >> >>> 0.95 aligns well and doesn't create situations where different
> >> > > > >> >>> versions have different feature sets.  New features should be
> >> > > > >> >>> introduced and hardened in a dev/preview version, and then turn
> >> > > > >> >>> into the production ready versions after they've been proven out
> >> > > > >> >>> a bit.
> >> > > > >> >>>
> >> > > > >> >>> Jon.
> >> > > > >> >>>
> >> > > > >> >>> On Fri, Mar 1, 2013 at 11:00 AM, lars hofhansl <larsh@apache.org> wrote:
> >> > > > >> >>>
> >> > > > >> >>> > This is an open source project, as long as there is a volunteer
> >> > > > >> >>> > to backport a patch I see no problem with doing this.
> >> > > > >> >>> > The only thing we as the community should ensure is that it
> >> > > > >> >>> > must be demonstrated that the patch does not destabilize the
> >> > > > >> >>> > 0.94 code base; that has to be done on a case by case basis.
> >> > > > >> >>> >
> >> > > > >> >>> >
> >> > > > >> >>> > Also, there is no stable release of HBase other than 0.94 (0.95
> >> > > > >> >>> > is not stable, and we specifically state that it should not be
> >> > > > >> >>> > used in production).
> >> > > > >> >>> >
> >> > > > >> >>> > -- Lars
> >> > > > >> >>> >
> >> > > > >> >>> >
> >> > > > >> >>> >
> >> > > > >> >>> > ________________________________
> >> > > > >> >>> >  From: Jonathan Hsieh <jon@cloudera.com>
> >> > > > >> >>> > To: dev@hbase.apache.org
> >> > > > >> >>> > Sent: Friday, March 1, 2013 8:31 AM
> >> > > > >> >>> > Subject: [DISCUSS] More new feature backports to 0.94.
> >> > > > >> >>> >
> >> > > > >> >>> > I was thinking more about HBASE-7360 (backport snapshots to
> >> > > > >> >>> > 0.94) and also saw HBASE-7965 which suggests porting some
> >> > > > >> >>> > major-ish features (table locks, online merge) in to the apache
> >> > > > >> >>> > 0.94 line.   We should chat about what we want to do about new
> >> > > > >> >>> > features and bringing them into stable versions (0.94 today)
> >> > > > >> >>> > and in general criteria we use for future versions.
> >> > > > >> >>> >
> >> > > > >> >>> > This is similar to the snapshots backport discussion and
> >> > > > >> >>> > earlier backport discussions.  Here's my understanding of the
> >> > > > >> >>> > high level points we basically agree upon.
> >> > > > >> >>> > * Backporting new features to the previous major version incurs
> >> > > > >> >>> > more cost when developing new features, pushes back efforts on
> >> > > > >> >>> > making the trunk versions and reduces incentive to move to
> >> > > > >> >>> > newer versions.
> >> > > > >> >>> > * Backporting new features to earlier versions (0.9x.0, 0.9x.1)
> >> > > > >> >>> > is reasonable since they are generally less stable.
> >> > > > >> >>> > * Backporting new features to later versions (0.9x.5, 0.9x.6)
> >> > > > >> >>> > is less reasonable (ex: a 0.94.6 or 0.94.7 should only include
> >> > > > >> >>> > robust features).
> >> > > > >> >>> > * Backporting orthogonal features (snapshots) seems less risky
> >> > > > >> >>> > than core-changing features.
> >> > > > >> >>> > * An exception: If multiple distributions declare intent to
> >> > > > >> >>> > backport, it makes sense to backport a feature (snapshots, for
> >> > > > >> >>> > example).
> >> > > > >> >>> > Some new circumstances and discussion topics:
> >> > > > >> >>> > * We now have a dev branch (0.95) with looser compat
> >> > > > >> >>> > requirements that we could more readily release with
> >> > > > >> >>> > dev/preview versions.  Shouldn't this reduce the need to
> >> > > > >> >>> > backport features to the apache stable branches? Would these
> >> > > > >> >>> > dev/preview releases "replace" the 0.x.0 or 0.x.1 releases?
> >> > > > >> >>> > * For major features in later versions we should raise the bar
> >> > > > >> >>> > on the amount of testing and probably be more explicit about
> >> > > > >> >>> > what testing is done (unit tests not sufficient, system testing
> >> > > > >> >>> > stories/reports a requirement).
> >> > > > >> >>> > Any other suggestions?
> >> > > > >> >>> >
> >> > > > >> >>> > Jon.
> >> > > > >> >>> >
> >> > > > >> >>> > --
> >> > > > >> >>> > // Jonathan Hsieh (shay)
> >> > > > >> >>> > // Software Engineer, Cloudera
> >> > > > >> >>> > // jon@cloudera.com
> >> > > > >> >>> >
> >> > > > >> >>>
> >> > > > >> >>>
> >> > > > >> >>>
> >> > > > >> >>> --
> >> > > > >> >>> // Jonathan Hsieh (shay)
> >> > > > >> >>> // Software Engineer, Cloudera
> >> > > > >> >>> // jon@cloudera.com
> >> > > > >> >>>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> >>
> >>
> >> --
> >> // Jonathan Hsieh (shay)
> >> // Software Engineer, Cloudera
> >> // jon@cloudera.com
> >>
> >
>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
>
> // jon@cloudera.com
>



-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com
