hbase-dev mailing list archives

From Jonathan Hsieh <...@cloudera.com>
Subject Re: [DISCUSS] More new feature backports to 0.94.
Date Sat, 02 Mar 2013 15:36:10 GMT
In general, I have a preference against backporting features for the
reasons that Enis, Elliott, and Jean-Marc consider valid.  To be clear,
this preference doesn't mean I am -1 to all backports onto the stable
apache branch.  Let's do it case-by-case; my main ask is to make major
backports rare and to make it the norm to require significantly more
evidence of testing than usual.  I will -1 a major backport that lacks this
evidence.  This will come up again in the future.

With the cases Lars proposed -- I prefer #3 (just say no) but find #1 (be
very careful) acceptable given higher level of evidence.  #2 (new release
branch) is onerous -- I'd rather we just get preview-release branches out
more frequently to not have to deal with this.  Arguably, the
preview-release branches serve the purpose of getting releases out more
frequently and giving a feature time to harden from a few common points.
My hope is that these preview releases will replace what were the 0.x.0 and
0.x.1 releases from previous versions.

So what kind of evidence would I like to see? We can use the snapshots case
as an example.

When backporting snapshots was brought up, I actually preferred that we not
backport that feature.  There was demand, so we agreed that we'd do it but
not backport it until it is "rock solid".  Here's evidence to support the
case that the feature and backport are solid:
* Its code history is publicly documented and has been available since
December.
* Its design documentation has been available for even longer.
* The feature is mostly additive and doesn't affect vital paths.
* It was tested against trunk and then later tested against a 0.94 variant
that is closer to the target apache branch.
* The version in the trunk branch has been reviewed by 5 committers.
* Limitations are either documented (please let me know if we should
improve it more) or non-critical.
* Testing and hardening anecdotes have been documented in the original and
backport jira.  There has been some relatively long term testing and fault
injection testing (roughly 4-6 weeks).
* It will be backported in a "big bang" -- all pieces get added or none
will.

This is a level I consider to be stronger than the normal testing expected
for a patch.  Ideally, something at least this level is what I would expect
for other major backports.  Do we agree on that?

For the table locks case, some of this may be a misperception of timing on
my part.  I saw a notification about this in jira, which made me think it
was more imminent.  Looking into it, I see that currently the development
and application of the zk table lock feature isn't complete -- the
mechanism is committed but it isn't applied and integrated into all the
operations (split, assign etc. still on the way).  I've asked for
documentation and Enis has graciously added a great design doc that will
help reviewers understand it.  I'd love to be able to spend time system
testing it to really beat it up, or at least have anecdotes from folks about
their efforts on the apache version.  Finally, I would want to see this
feature come in as a big bang -- get it complete enough in trunk before
backporting the pieces to a stable branch.

I haven't invested time into the online merge backport decision, but my
instinct there is also to not port the feature.  It is less risky since
it is an additive feature, but has less reward since we already have a
less-friendly-but-comparable mechanism.  Since merge seems similar to split
(which took a while to get right), testing its correctness in failure cases
at the system level would be a prereq.

Jon.

On Sat, Mar 2, 2013 at 3:43 AM, Nicolas Liochon <nkeywal@gmail.com> wrote:

> New feature is a red herring imho: To me the only question is the
> regression risk. And a feature can have a much lower regression risk than
> a bug fix. I guess we've all seen a fix for a non critical bug putting down
> a production system. Being able to backport features is a competitive
> advantage that leverages on a good architecture and a good test suite.
> Maintaining a branch adds a cost for everybody: if you have a bug to fix in
> 0.94.6.1, you need to fix it in 0.94.7 as well. So we should do it only if we
> really have to, and plan it accordingly (i.e. we should not have to create
> a 0.94.7.1 a week after the creation of the 0.94.6.1).
>
> In the future, the test suite should also help us to estimate and minimize
> the risk. We're not there yet, but having a good test coverage is key for
> version 1 imho.
>
> So that makes me +1 for backport, and  0 for branching (+1 if there is a
> good reason and a plan, but here it's a theoretical discussion, so,... ;-)
> )
>
> Nicolas
>
>
> On Sat, Mar 2, 2013 at 4:44 AM, lars hofhansl <larsh@apache.org> wrote:
>
> > I did mean "stabilizing". What I was trying to point out is that stuff we
> > backport might stabilize HBase.
> >
> >
> >
> > ________________________________
> >  From: Ted Yu <yuzhihong@gmail.com>
> > To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org>
> > Sent: Friday, March 1, 2013 7:30 PM
> > Subject: Re: [DISCUSS] More new feature backports to 0.94.
> >
> > bq. That is only if we do not backport stabilizing "features".
> > Did you mean destabilizing above :-)
> >
> > My preference is option #1. With option #2, the community would be dealing
> > with one more branch which would increase the amount of work validating
> > each release candidate.
> >
> > To me, the difference between option #2 and the upcoming release candidates
> > of 0.95 would blur.
> >
> > Cheers
> >
> > On Fri, Mar 1, 2013 at 7:24 PM, lars hofhansl <larsh@apache.org> wrote:
> >
> > > That is only if we do not backport stabilizing "features". There is an
> > > "opportunity cost" to be paid if we take a too rigorous approach too.
> > >
> > > Take for example table-locks (which prompted this discussion). With that in
> > > place we can do safe online schema changes (that won't fail and leave
> > > the table in an undefined state when a concurrent split happens), it
> > > also allows for online merge.
> > >
> > > Now, is that a destabilizing "feature", or will it make HBase more stable
> > > and hence is an "improvement"? Depends on viewpoint, doesn't it?
> > > -- Lars
> > >
> > >
> > > ________________________________
> > >  From: Jean-Marc Spaggiari <jean-marc@spaggiari.org>
> > > To: dev@hbase.apache.org
> > > Sent: Friday, March 1, 2013 7:12 PM
> > > Subject: Re: [DISCUSS] More new feature backports to 0.94.
> > >
> > > @Lars: No, not any concern about anything already backported. Just a
> > > preference to #2 because it seems to make things more stable and
> > > easier to manage. New feature = new release. Given new sub-releases
> > > are for fixes and improvements, but not new features. Also, if we
> > > backport a feature in one or many previous releases, we will have to
> > > backport also all the fixes each time there will be an issue. So we
> > > will have more maintenance work on previous releases.
> > >
> > > 2013/3/1 Enis Söztutar <enis.soz@gmail.com>:
> > > > I think the current way of risk vs rewards analysis is working well. We
> > > > will just continue doing that on a case by case basis, discussing the
> > > > implications on individual issues.
> > > >
> > > >
> > > >
> > > > On Fri, Mar 1, 2013 at 6:46 PM, Lars Hofhansl <lhofhansl@yahoo.com> wrote:
> > > >
> > > >> BTW are you concerned about any specific back port we did in the past? So
> > > >> far we have not seen any destabilization in any of the 0.94 releases.
> > > >>
> > > >> Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
> > > >>
> > > >> >Hi Lars, #2, does it mean you will stop back-porting the new features
> > > >> >when it will become a "long-term" release? If so, I'm for option #2...
> > > >> >
> > > >> >JM
> > > >> >
> > > >> >In your option
> > > >> >2013/3/1 Enis Söztutar <enis.soz@gmail.com>:
> > > >> >> Thanks Lars, I think it is a good listing of the options we have.
> > > >> >>
> > > >> >> I'll be +1 for #1 and #2, with #1 being a preference.
> > > >> >>
> > > >> >> Enis
> > > >> >>
> > > >> >>
> > > >> >> On Fri, Mar 1, 2013 at 6:10 PM, lars hofhansl <larsh@apache.org> wrote:
> > > >> >>
> > > >> >>> So it seems that until we have a stable 0.96 (maybe 0.96.1 or 0.96.2) we
> > > >> >>> have three options:
> > > >> >>> 1. Backport new features to 0.94 as we see fit as long as we do not
> > > >> >>> destabilize 0.94.
> > > >> >>> 2. Declare a certain point release (0.94.6 looks like a good candidate) as
> > > >> >>> a "long term", create an 0.94.6 branch (in addition to the usual 0.94.6
> > > >> >>> tag) and then create 0.94.6.x fix only releases. I would volunteer to
> > > >> >>> maintain a 0.94.6 branch in addition to the 0.94 branch.
> > > >> >>> 3. Categorically do not backport new features into 0.94 and defer to 0.95.
> > > >> >>>
> > > >> >>> I'd be +1 on option #1 and #2, and -1 on option #3.
> > > >> >>>
> > > >> >>> -- Lars
> > > >> >>>
> > > >> >>>
> > > >> >>>
> > > >> >>> ________________________________
> > > >> >>>  From: Jonathan Hsieh <jon@cloudera.com>
> > > >> >>> To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org>
> > > >> >>> Sent: Friday, March 1, 2013 3:11 PM
> > > >> >>> Subject: Re: [DISCUSS] More new feature backports to 0.94.
> > > >> >>>
> > > >> >>> I think we are basically agreeing -- my primary concern is that bringing new
> > > >> >>> features into vital paths introduces more risk; I'd rather not backport major
> > > >> >>> new features unless we achieve a higher level of assurance through system
> > > >> >>> and basic fault injection testing.
> > > >> >>>
> > > >> >>> For the three current examples -- snapshots, zk table locks, online merge
> > > >> >>> -- I actually would prefer not including any in apache 0.94.  Of the bunch,
> > > >> >>> I feel the table locks are the most risky since they affect vital paths a
> > > >> >>> user must use, whereas snapshots and online merge are features that a
> > > >> >>> user could choose to use but does not necessarily have to use.  I'll voice
> > > >> >>> my concerns, reason for concerns, and justifications on the individual
> > > >> >>> jiras.
> > > >> >>>
> > > >> >>> I do feel that new features being in a dev/preview release like 0.95 aligns
> > > >> >>> well and doesn't create situations where different versions have different
> > > >> >>> feature sets.  New features should be introduced and hardened in a
> > > >> >>> dev/preview version, and then turn into the production ready versions after
> > > >> >>> they've been proven out a bit.
> > > >> >>>
> > > >> >>> Jon.
> > > >> >>>
> > > >> >>> On Fri, Mar 1, 2013 at 11:00 AM, lars hofhansl <larsh@apache.org> wrote:
> > > >> >>>
> > > >> >>> > This is an open source project, as long as there is a volunteer to
> > > >> >>> > backport a patch I see no problem with doing this.
> > > >> >>> > The only thing we as the community should ensure is that it must be
> > > >> >>> > demonstrated that the patch does not destabilize the 0.94 code base; that
> > > >> >>> > has to be done on a case by case basis.
> > > >> >>> >
> > > >> >>> >
> > > >> >>> > Also, there is no stable release of HBase other than 0.94 (0.95 is not
> > > >> >>> > stable, and we specifically state that it should not be used in
> > > >> >>> > production).
> > > >> >>> >
> > > >> >>> > -- Lars
> > > >> >>> >
> > > >> >>> >
> > > >> >>> >
> > > >> >>> > ________________________________
> > > >> >>> >  From: Jonathan Hsieh <jon@cloudera.com>
> > > >> >>> > To: dev@hbase.apache.org
> > > >> >>> > Sent: Friday, March 1, 2013 8:31 AM
> > > >> >>> > Subject: [DISCUSS] More new feature backports to 0.94.
> > > >> >>> >
> > > >> >>> > I was thinking more about HBASE-7360 (backport snapshots to 0.94) and also
> > > >> >>> > saw HBASE-7965 which suggests porting some major-ish features (table locks,
> > > >> >>> > online merge) into the apache 0.94 line.  We should chat about what we
> > > >> >>> > want to do about new features and bringing them into stable versions (0.94
> > > >> >>> > today) and in general criteria we use for future versions.
> > > >> >>> >
> > > >> >>> > This is similar to the snapshots backport discussion and earlier backport
> > > >> >>> > discussions.  Here's my understanding of the high level points we basically
> > > >> >>> > agree upon.
> > > >> >>> > * Backporting new features to the previous major version incurs more cost
> > > >> >>> > when developing new features, pushes back efforts on making the trunk
> > > >> >>> > versions and reduces incentive to move to newer versions.
> > > >> >>> > * Backporting new features to earlier versions (0.9x.0, 0.9x.1) is
> > > >> >>> > reasonable since they are generally less stable.
> > > >> >>> > * Backporting new features to later versions (0.9x.5, 0.9x.6) is less
> > > >> >>> > reasonable -- (ex: a 0.94.6, or 0.94.7 should only include robust
> > > >> >>> > features).
> > > >> >>> > * Backporting orthogonal features (snapshots) seems less risky than core
> > > >> >>> > changing features.
> > > >> >>> > * An exception: If multiple distributions declare intent to backport, it makes
> > > >> >>> > sense to backport a feature (snapshots for example).
> > > >> >>> >
> > > >> >>> > Some new circumstances and discussion topics:
> > > >> >>> > * We now have a dev branch (0.95) with looser compat requirements that we
> > > >> >>> > could more readily release with dev/preview versions.  Shouldn't this
> > > >> >>> > reduce the need to backport features to the apache stable branches?  Would
> > > >> >>> > releases of these releases "replace" the 0.x.0 or 0.x.1 releases?
> > > >> >>> > * For major features in later versions we should raise the bar on the
> > > >> >>> > amount of testing and probably be more explicit about what testing is done
> > > >> >>> > (unit tests not sufficient, system testing stories/reports a requirement).
> > > >> >>> > Any other suggestions?
> > > >> >>> >
> > > >> >>> > Jon.
> > > >> >>> >
> > > >> >>> > --
> > > >> >>> > // Jonathan Hsieh (shay)
> > > >> >>> > // Software Engineer, Cloudera
> > > >> >>> > // jon@cloudera.com
> > > >> >>> >
> > > >> >>>
> > > >> >>>
> > > >> >>>
> > > >> >>> --
> > > >> >>> // Jonathan Hsieh (shay)
> > > >> >>> // Software Engineer, Cloudera
> > > >> >>> // jon@cloudera.com
> > > >> >>>
> > > >>
> > >
> >
>



-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com
