hbase-dev mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: [DISCUSS] More new feature backports to 0.94.
Date Sat, 02 Mar 2013 20:46:47 GMT
In the end, I think, it boils down to the established process.
Anybody can open a jira and propose a patch. If it gets +1's from a few committers and no
-1's we should commit it.
As I said on HBASE-7965, if we cannot convince Jon and Elliott that this is safe to do, we
should not do it (either because Enis and I agree, or because Jon -1's it). No hard feelings
either way, I hope (none from my side at least).


It seems we're mostly in agreement and just differ a bit in what constitutes a feature vs.
a bug fix.

-- Lars



________________________________
 From: Jonathan Hsieh <jon@cloudera.com>
To: dev@hbase.apache.org 
Cc: lars hofhansl <larsh@apache.org> 
Sent: Saturday, March 2, 2013 8:26 AM
Subject: Re: [DISCUSS] More new feature backports to 0.94.
 

To be clear, a key point is that unit testing is required but not sufficient.  I need anecdotes
about system testing with at least some unexpected fault handling and stress.  If the feature
is still actively being developed, it should go into a dev branch (GitHub or svn) that eventually merges.
Some info about perf would be nice as well if that is affected.

In cases that aren't too burdensome, I would prefer consecutive individual commits to a
stable branch as opposed to a single mega patch.  This of course is a case-by-case decision.
(snapshots is about 80 patches.. way too burdensome).


Jon.


On Sat, Mar 2, 2013 at 8:14 AM, Ted Yu <yuzhihong@gmail.com> wrote:

>bq. I would want to see this feature come in as a big bang -- get it
>complete enough in trunk before backporting the pieces to a stable branch.
>
>I agree with Jon on this point.
>Porting in one big patch allows us to think through related use cases.
>Another benefit is that there wouldn't be a glitch in the API, as there could be if the first
>batch of backports went into 0.94.x and the second batch went into 0.94.x+1.
>Running the feature through the test suite in trunk continuously gives us time
>to discover defects before the backport.
>
>Cheers
>
>
>On Sat, Mar 2, 2013 at 7:36 AM, Jonathan Hsieh <jon@cloudera.com> wrote:
>
>> In general, I have a preference against backporting features for the
>> reasons that Enis, Elliott, and Jean-Marc consider valid.  To be clear,
>> this preference doesn't mean I am -1 to all backports onto the stable
>> apache branch.  Let's do it case-by-case; my main ask is to make major
>> backports rare and to make it the norm to require significantly more
>> evidence of testing than usual.  I will -1 a major backport that lacks this
>> evidence.  This will come up again in the future.
>>
>> With the cases Lars proposed -- I prefer #3 (just say no) but find #1 (be
>> very careful) acceptable given a higher level of evidence.  #2 (new release
>> branch) is onerous -- I'd rather we just get preview-release branches out
>> more frequently to not have to deal with this.  Arguably, the
>> preview-release branches serve the purpose of getting releases out more
>> frequently and giving a feature time to harden from a few common points.
>> My hope is that these preview releases will replace what were the 0.x.0 and
>> 0.x.1 releases from previous versions.
>>
>> So what kind of evidence would I like to see? We can use snapshots case as
>> an example.
>>
>> When backporting snapshots was brought up, I actually preferred that we not
>> backport that feature.  There was demand, so we agreed that we'd do it but
>> not backport it until it is "rock solid".  Here's evidence to support the
>> case that the feature and backport are solid:
>> * Its code history is publicly documented and has been available since
>> December.
>> * Its design documentation has been available for even longer.
>> * The feature is mostly additive and doesn't affect vital paths.
>> * It was tested against trunk and then later tested against a 0.94 variant
>> that is closer to the target apache branch.
>> * The version in the trunk branch has been reviewed by 5 committers.
>> * Limitations are either documented (please let me know if we should
>> improve it more) or non-critical.
>> * Testing and hardening anecdotes have been documented in the original and
>> backport jira.  There has been some relatively long term testing and fault
>> injection testing (roughly 4-6 weeks).
>> * It will be backported in a "big bang" -- all pieces get added or none
>> will.
>>
>> This is a level I consider to be stronger than the normal testing expected
>> for a patch.  Ideally, something at least this level is what I would expect
>> for other major backports.  Do we agree on that?
>>
>> For the table locks case, some of this may be a misperception
>> in timing from my point of view.  I see a notification about this in jira
>> which makes me think it is more imminent.  Looking into it, I see that
>> currently the development and application of the zk table lock feature
>> isn't complete -- the mechanism is committed but it isn't applied and
>> integrated into all the operations (split, assign, etc. are still on the way).
>> I've asked for documentation and Enis has graciously added a great design
>> doc that will help reviewers understand it.  I'd love to be able to spend
>> time system testing to really beat it up, or at least have anecdotes from
>> folks about their efforts on the apache version.  Finally, I would want to
>> see this feature come in as a big bang -- get it complete enough in trunk
>> before backporting the pieces to a stable branch.
>>
>> I haven't invested time into the online merge backport decision but my
>> instinct there is to not port the feature as well.  It is less risky since
>> it is an additive feature but has less reward since we already have a
>> less-friendly-but-comparable mechanism.  Since merge seems similar to split
>> (which took a while to get right) testing its correctness in failure cases
>> at the system level would be a prereq.
>>
>> Jon.
>>
>> On Sat, Mar 2, 2013 at 3:43 AM, Nicolas Liochon <nkeywal@gmail.com> wrote:
>>
>> > New feature is a red herring imho: to me the only question is the
>> > regression risk. And a feature can have a much lower regression risk than
>> > a bug fix. I guess we've all seen a fix for a non-critical bug take down
>> > a production system. Being able to backport features is a competitive
>> > advantage that leverages a good architecture and a good test suite.
>> > Maintaining a branch adds a cost for everybody: if you have a bug to fix in
>> > 0.94.6.1, you need to fix it in 0.94.7 as well. So we should do it only if we
>> > really have to, and plan it accordingly (i.e. we should not have to create
>> > a 0.94.7.1 a week after the creation of the 0.94.6.1).
>> >
>> > In the future, the test suite should also help us to estimate and minimize
>> > the risk. We're not there yet, but having good test coverage is key for
>> > version 1 imho.
>> >
>> > So that makes me +1 for backport, and 0 for branching (+1 if there is a
>> > good reason and a plan, but here it's a theoretical discussion, so... ;-) )
>> >
>> > Nicolas
>> >
>> >
>> > On Sat, Mar 2, 2013 at 4:44 AM, lars hofhansl <larsh@apache.org> wrote:
>> >
>> > > I did mean "stabilizing". What I was trying to point out is that stuff we
>> > > backport might stabilize HBase.
>> > >
>> > >
>> > >
>> > > ________________________________
>> > >  From: Ted Yu <yuzhihong@gmail.com>
>> > > To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org>
>> > > Sent: Friday, March 1, 2013 7:30 PM
>> > > Subject: Re: [DISCUSS] More new feature backports to 0.94.
>> > >
>> > > bq. That is only if we do not backport stabilizing "features".
>> > > Did you mean destabilizing above? :-)
>> > >
>> > > My preference is option #1. With option #2, the community would be dealing
>> > > with one more branch, which would increase the amount of work validating
>> > > each release candidate.
>> > >
>> > > To me, the difference between option #2 and the upcoming release candidates
>> > > of 0.95 would blur.
>> > >
>> > > Cheers
>> > >
>> > > On Fri, Mar 1, 2013 at 7:24 PM, lars hofhansl <larsh@apache.org> wrote:
>> > >
>> > > > That is only if we do not backport stabilizing "features". There is an
>> > > > "opportunity cost" to be paid if we take too rigorous an approach, too.
>> > > >
>> > > > Take for example table-locks (which prompted this discussion). With that in
>> > > > place we can do safe online schema changes (that won't fail and leave
>> > > > the table in an undefined state when a concurrent split happens); it
>> > > > also allows for online merge.
>> > > >
>> > > > Now, is that a destabilizing
>> > > > "feature", or will it make HBase more stable and hence is an
>> > > > "improvement"? Depends on viewpoint, doesn't it?
>> > > > -- Lars
>> > > >
>> > > >
>> > > > ________________________________
>> > > >  From: Jean-Marc Spaggiari <jean-marc@spaggiari.org>
>> > > > To: dev@hbase.apache.org
>> > > > Sent: Friday, March 1, 2013 7:12 PM
>> > > > Subject: Re: [DISCUSS] More new feature backports to 0.94.
>> > > >
>> > > > @Lars: No, no concern about anything already backported. Just a
>> > > > preference for #2 because it seems to make things more stable and
>> > > > easier to manage. New feature = new release. New sub-releases
>> > > > are for fixes and improvements, but not new features. Also, if we
>> > > > backport a feature to one or many previous releases, we will also have to
>> > > > backport all the fixes each time there is an issue. So we
>> > > > will have more maintenance work on previous releases.
>> > > >
>> > > > 2013/3/1 Enis Söztutar <enis.soz@gmail.com>:
>> > > > > I think the current way of risk vs rewards analysis is working well. We
>> > > > > will just continue doing that on a case by case basis, discussing the
>> > > > > implications on individual issues.
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Fri, Mar 1, 2013 at 6:46 PM, Lars Hofhansl <lhofhansl@yahoo.com> wrote:
>> > > > >
>> > > > >> BTW are you concerned about any specific back port we did in the past? So
>> > > > >> far we have not seen any destabilization in any of the 0.94 releases.
>> > > > >>
>> > > > >> Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
>> > > > >>
>> > > > >> >Hi Lars, does #2 mean you will stop back-porting new features
>> > > > >> >once it becomes a "long-term" release? If so, I'm for option #2...
>> > > > >> >
>> > > > >> >JM
>> > > > >> >
>> > > > >> >In your option
>> > > > >> >2013/3/1 Enis Söztutar <enis.soz@gmail.com>:
>> > > > >> >> Thanks Lars, I think it is a good listing of the options we have.
>> > > > >> >>
>> > > > >> >> I'll be +1 for #1 and #2, with #1 being a preference.
>> > > > >> >>
>> > > > >> >> Enis
>> > > > >> >>
>> > > > >> >>
>> > > > >> >> On Fri, Mar 1, 2013 at 6:10 PM, lars hofhansl <larsh@apache.org> wrote:
>> > > > >> >>
>> > > > >> >>> So it seems that until we have a stable 0.96 (maybe 0.96.1 or 0.96.2) we
>> > > > >> >>> have three options:
>> > > > >> >>> 1. Backport new features to 0.94 as we see fit as long as we do not
>> > > > >> >>> destabilize 0.94.
>> > > > >> >>> 2. Declare a certain point release (0.94.6 looks like a good candidate) as
>> > > > >> >>> a "long term" release, create a 0.94.6 branch (in addition to the usual 0.94.6
>> > > > >> >>> tag) and then create 0.94.6.x fix-only releases. I would volunteer to
>> > > > >> >>> maintain a 0.94.6 branch in addition to the 0.94 branch.
>> > > > >> >>> 3. Categorically do not backport new features into 0.94 and defer to 0.95.
>> > > > >> >>>
>> > > > >> >>> I'd be +1 on option #1 and #2, and -1 on option #3.
>> > > > >> >>>
>> > > > >> >>> -- Lars
>> > > > >> >>>
>> > > > >> >>>
>> > > > >> >>>
>> > > > >> >>> ________________________________
>> > > > >> >>>  From: Jonathan Hsieh <jon@cloudera.com>
>> > > > >> >>> To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org>
>> > > > >> >>> Sent: Friday, March 1, 2013 3:11 PM
>> > > > >> >>> Subject: Re: [DISCUSS] More new feature backports to 0.94.
>> > > > >> >>>
>> > > > >> >>> I think we are basically agreeing -- my primary concern is that bringing new
>> > > > >> >>> features into vital paths introduces more risk. I'd rather not backport major
>> > > > >> >>> new features unless we achieve a higher level of assurance through system
>> > > > >> >>> and basic fault injection testing.
>> > > > >> >>>
>> > > > >> >>> For the three current examples -- snapshots, zk table locks, online merge
>> > > > >> >>> -- I actually would prefer not including any in apache 0.94.  Of the bunch,
>> > > > >> >>> I feel the table locks are the most risky since they affect vital paths a
>> > > > >> >>> user must use, whereas snapshots and online merge are features that a
>> > > > >> >>> user could choose to use but does not necessarily have to use.  I'll voice
>> > > > >> >>> my concerns, reasons for concern, and justifications on the individual
>> > > > >> >>> jiras.
>> > > > >> >>>
>> > > > >> >>> I do feel that new features being in a dev/preview release like 0.95 aligns
>> > > > >> >>> well and doesn't create situations where different versions have different
>> > > > >> >>> feature sets.  New features should be introduced and hardened in a
>> > > > >> >>> dev/preview version, and then turn into the production-ready versions after
>> > > > >> >>> they've been proven out a bit.
>> > > > >> >>>
>> > > > >> >>> Jon.
>> > > > >> >>>
>> > > > >> >>> On Fri, Mar 1, 2013 at 11:00 AM, lars hofhansl <larsh@apache.org> wrote:
>> > > > >> >>>
>> > > > >> >>> > This is an open source project; as long as there is a volunteer to
>> > > > >> >>> > backport a patch I see no problem with doing this.
>> > > > >> >>> > The only thing we as the community should ensure is that it must be
>> > > > >> >>> > demonstrated that the patch does not destabilize the 0.94 code base; that
>> > > > >> >>> > has to be done on a case by case basis.
>> > > > >> >>> >
>> > > > >> >>> >
>> > > > >> >>> > Also, there is no stable release of HBase other than 0.94 (0.95 is not
>> > > > >> >>> > stable, and we specifically state that it should not be used in production).
>> > > > >> >>> >
>> > > > >> >>> > -- Lars
>> > > > >> >>> >
>> > > > >> >>> >
>> > > > >> >>> >
>> > > > >> >>> > ________________________________
>> > > > >> >>> >  From: Jonathan Hsieh <jon@cloudera.com>
>> > > > >> >>> > To: dev@hbase.apache.org
>> > > > >> >>> > Sent: Friday, March 1, 2013 8:31 AM
>> > > > >> >>> > Subject: [DISCUSS] More new feature backports to 0.94.
>> > > > >> >>> >
>> > > > >> >>> > I was thinking more about HBASE-7360 (backport snapshots to 0.94) and also
>> > > > >> >>> > saw HBASE-7965 which suggests porting some major-ish features (table locks,
>> > > > >> >>> > online merge) into the apache 0.94 line.  We should chat about what we
>> > > > >> >>> > want to do about new features and bringing them into stable versions (0.94
>> > > > >> >>> > today) and in general what criteria we use for future versions.
>> > > > >> >>> >
>> > > > >> >>> > This is similar to the snapshots backport discussion and earlier backport
>> > > > >> >>> > discussions.  Here's my understanding of the high level points we basically
>> > > > >> >>> > agree upon.
>> > > > >> >>> > * Backporting new features to the previous major version incurs more cost
>> > > > >> >>> > when developing new features, pushes back efforts on making the trunk
>> > > > >> >>> > versions and reduces incentive to move to newer versions.
>> > > > >> >>> > * Backporting new features to earlier versions (0.9x.0, 0.9x.1) is
>> > > > >> >>> > reasonable since they are generally less stable.
>> > > > >> >>> > * Backporting new features to later versions (0.9x.5, 0.9x.6) is less
>> > > > >> >>> > reasonable -- (ex: a 0.94.6 or 0.94.7 should only include robust
>> > > > >> >>> > features).
>> > > > >> >>> > * Backporting orthogonal features (snapshots) seems less risky than
>> > > > >> >>> > core-changing features.
>> > > > >> >>> > * An exception: if multiple distributions declare intent to backport, it
>> > > > >> >>> > makes sense to backport a feature (snapshots, for example).
>> > > > >> >>> >
>> > > > >> >>> > Some new circumstances and discussion topics:
>> > > > >> >>> > * We now have a dev branch (0.95) with looser compat requirements that we
>> > > > >> >>> > could more readily release with dev/preview versions.  Shouldn't this
>> > > > >> >>> > reduce the need to backport features to the apache stable branches?  Would
>> > > > >> >>> > these releases "replace" the 0.x.0 or 0.x.1 releases?
>> > > > >> >>> > * For major features in later versions we should raise the bar on the
>> > > > >> >>> > amount of testing and probably be more explicit about what testing is done
>> > > > >> >>> > (unit tests not sufficient, system testing stories/reports a requirement).
>> > > > >> >>> > Any other suggestions?
>> > > > >> >>> >
>> > > > >> >>> > Jon.
>> > > > >> >>> >
>> > > > >> >>> > --
>> > > > >> >>> > // Jonathan Hsieh (shay)
>> > > > >> >>> > // Software Engineer, Cloudera
>> > > > >> >>> > // jon@cloudera.com
>> > > > >> >>> >
>> > > > >> >>>
>> > > > >> >>>
>> > > > >> >>>
>> > > > >> >>> --
>> > > > >> >>> // Jonathan Hsieh (shay)
>> > > > >> >>> // Software Engineer, Cloudera
>> > > > >> >>> // jon@cloudera.com
>> > > > >> >>>
>> > > > >>
>> > > >
>> > >
>> >
>>
>>
>>
>> --
>> // Jonathan Hsieh (shay)
>> // Software Engineer, Cloudera
>> // jon@cloudera.com
>>
>


-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com