From: Jonathan Hsieh
Date: Sat, 2 Mar 2013 13:49:32 -0800
Subject: Re: [DISCUSS] More new feature backports to 0.94.
To: dev@hbase.apache.org, lars hofhansl

I don't think it is a debate about feature vs bug fix -- I've been trying
to make a general case about major feature backports. I agree that we are
basically on the same page for the general case.

I've been using some of the current candidate features as examples but I'm
really trying to focus on defining a general "finished" condition early on
for big backports (bullet points that would be highlights of the next
release), and to express the need for higher scrutiny on these commits.
I'll post specific details for each proposal in the appropriate jira.

In the specific case, I actually do want the table locks, but only after
they are done and have some evidence of stability. I would much rather
have known bugs with known workarounds instead of unknown issues introduced
by a backported feature, and would like to avoid hackery introduced by
compatibility-bug fire drills.
The point I'm trying to make about 0.95.x is that ideally it is where the
new features get further hardened (as opposed to the stable branch).
Ideally the release manager for that version will start gatekeeping what
new major features/changes make it in there so that we have a chance of
releasing it and a 0.96 sometime soon. :)

Jon.

On Sat, Mar 2, 2013 at 12:46 PM, lars hofhansl wrote:

> In the end, I think, it boils down to the established process.
> Anybody can open a jira and propose a patch. If it gets +1's from a few
> committers and no -1's we should commit it.
> As I said on HBASE-7965, if we cannot convince Jon and Elliott that this
> is safe to do, we should not do it (either because Enis and I agree, or
> because Jon -1's it). No hard feelings either way, I hope (none from my
> side at least).
>
> It seems we're mostly in agreement and just differ a bit in what
> constitutes a feature vs. a bug fix.
>
> -- Lars
>
> ________________________________
> From: Jonathan Hsieh
> To: dev@hbase.apache.org
> Cc: lars hofhansl
> Sent: Saturday, March 2, 2013 8:26 AM
> Subject: Re: [DISCUSS] More new feature backports to 0.94.
>
> To be clear, a key point is that unit testing is required but not
> sufficient. I need anecdotes about system testing with at least some
> unexpected fault handling and stress. If the feature is still actively
> being developed, it should go into a dev branch (GitHub or svn) that
> eventually merges. Some info about perf would be nice as well if that is
> affected.
>
> In cases that aren't too burdensome, I would prefer consecutive
> individual commits to a stable branch as opposed to a single mega patch.
> This of course is a case-by-case decision. (snapshots is about 80
> patches.. way too burdensome).
>
> Jon.
>
> On Sat, Mar 2, 2013 at 8:14 AM, Ted Yu wrote:
>
> >bq. I would want to see this feature come in as a big bang -- get it
> >complete enough in trunk before backporting the pieces to a stable branch.
> >
> >I agree with Jon on this point.
> >Porting in one big patch allows us to think through related use cases.
> >Another benefit is that there wouldn't be a glitch in the API, in case
> >the first batch of backports went into 0.94.x and the second batch goes
> >into 0.94.x+1.
> >Running the feature through the test suite in trunk continuously gives us
> >time to discover defects before the backport.
> >
> >Cheers
> >
> >On Sat, Mar 2, 2013 at 7:36 AM, Jonathan Hsieh wrote:
> >
> >> In general, I have a preference against backporting features for the
> >> reasons that Enis, Elliott, and Jean-Marc consider valid. To be clear,
> >> this preference doesn't mean I am -1 to all backports onto the stable
> >> apache branch. Let's do it case-by-case; my main ask is to make major
> >> backports rare and to make it the norm to require significantly more
> >> evidence of testing than usual. I will -1 a major backport that lacks
> >> this evidence. This will come up again in the future.
> >>
> >> With the cases Lars proposed -- I prefer #3 (just say no) but find #1
> >> (be very careful) acceptable given a higher level of evidence. #2 (new
> >> release branch) is onerous -- I'd rather we just get preview-release
> >> branches out more frequently to not have to deal with this. Arguably,
> >> the preview-release branches serve the purpose of getting releases out
> >> more frequently and giving a feature time to harden from a few common
> >> points. My hope is that these preview releases will replace what were
> >> the 0.x.0 and 0.x.1 releases from previous versions.
> >>
> >> So what kind of evidence would I like to see? We can use the snapshots
> >> case as an example.
> >>
> >> When backporting snapshots was brought up, I actually preferred that we
> >> not backport that feature. There was demand, so we agreed that we'd do
> >> it but not backport it until it is "rock solid".
> >> Here's evidence to support the case that the feature and backport are
> >> solid:
> >> * Its code history is publicly documented and has been available since
> >> December.
> >> * Its design documentation has been available for even longer.
> >> * The feature is mostly additive and doesn't affect vital paths.
> >> * It was tested against trunk and then later tested against a 0.94
> >> variant that is closer to the target apache branch.
> >> * The version in the trunk branch has been reviewed by 5 committers.
> >> * Limitations are either documented (please let me know if we should
> >> improve it more) or non-critical.
> >> * Testing and hardening anecdotes have been documented in the original
> >> and backport jira. There has been some relatively long term testing and
> >> fault injection testing (roughly 4-6 weeks).
> >> * It will be backported in a "big bang" -- all pieces get added or none
> >> will.
> >>
> >> This is a level I consider to be stronger than the normal testing
> >> expected for a patch. Ideally, something at least this level is what I
> >> would expect for other major backports. Do we agree on that?
> >>
> >> For the table locks case, some of this may be a misperception in timing
> >> from my point of view. I see a notification about this in jira which
> >> makes me think it is more imminent. Looking into it, I see that
> >> currently the development and application of the zk table lock feature
> >> isn't complete -- the mechanism is committed but it isn't applied and
> >> integrated into all the operations (split, assign etc still on the way).
> >> I've asked for documentation and Enis has graciously added a great
> >> design doc that will help reviewers understand it. I'd love to be able
> >> to spend time system testing to really beat it up or at least have
> >> anecdotes from folks about their efforts on the apache version.
> >> Finally, I would want to see this feature come in as a big bang -- get
> >> it complete enough in trunk before backporting the pieces to a stable
> >> branch.
> >>
> >> I haven't invested time into the online merge backport decision but my
> >> instinct there is to not port the feature as well. It is less risky
> >> since it is an additive feature but has less reward since we already
> >> have a less-friendly-but-comparable mechanism. Since merge seems similar
> >> to split (which took a while to get right) testing its correctness in
> >> failure cases at the system level would be a prereq.
> >>
> >> Jon.
> >>
> >> On Sat, Mar 2, 2013 at 3:43 AM, Nicolas Liochon wrote:
> >>
> >> > New feature is a red herring imho: To me the only question is the
> >> > regression risk. And a feature can have a much lower regression risk
> >> > than a bug fix. I guess we've all seen a fix for a non-critical bug
> >> > taking down a production system. Being able to backport features is a
> >> > competitive advantage that leverages a good architecture and a good
> >> > test suite.
> >> > Maintaining a branch adds a cost for everybody: if you have a bug to
> >> > fix in 0.94.6.1, you need to fix it in 0.94.7 as well. So we should do
> >> > it only if we really have to, and plan it accordingly (i.e. we should
> >> > not have to create a 0.94.7.1 a week after the creation of the
> >> > 0.94.6.1).
> >> >
> >> > In the future, the test suite should also help us to estimate and
> >> > minimize the risk. We're not there yet, but having good test coverage
> >> > is key for version 1 imho.
> >> >
> >> > So that makes me +1 for backport, and 0 for branching (+1 if there is
> >> > a good reason and a plan, but here it's a theoretical discussion,
> >> > so,... ;-) )
> >> >
> >> > Nicolas
> >> >
> >> > On Sat, Mar 2, 2013 at 4:44 AM, lars hofhansl wrote:
> >> >
> >> > > I did mean "stabilizing".
> >> > > What I was trying to point out is that stuff we backport might
> >> > > stabilize HBase.
> >> > >
> >> > > ________________________________
> >> > > From: Ted Yu
> >> > > To: dev@hbase.apache.org; lars hofhansl
> >> > > Sent: Friday, March 1, 2013 7:30 PM
> >> > > Subject: Re: [DISCUSS] More new feature backports to 0.94.
> >> > >
> >> > > bq. That is only if we do not backport stabilizing "features".
> >> > > Did you mean destabilizing above? :-)
> >> > >
> >> > > My preference is option #1. With option #2, the community would be
> >> > > dealing with one more branch which would increase the amount of
> >> > > work validating each release candidate.
> >> > >
> >> > > To me, the difference between option #2 and the upcoming release
> >> > > candidates of 0.95 would blur.
> >> > >
> >> > > Cheers
> >> > >
> >> > > On Fri, Mar 1, 2013 at 7:24 PM, lars hofhansl wrote:
> >> > >
> >> > > > That is only if we do not backport stabilizing "features". There
> >> > > > is an "opportunity cost" to be paid if we take too rigorous an
> >> > > > approach.
> >> > > >
> >> > > > Take for example table-locks (which prompted this discussion).
> >> > > > With that in place we can do safe online schema changes (that
> >> > > > won't fail and leave the table in an undefined state when a
> >> > > > concurrent split happens); it also allows for online merge.
> >> > > >
> >> > > > Now, is that a destabilizing "feature", or will it make HBase
> >> > > > more stable and hence is an "improvement"? Depends on viewpoint,
> >> > > > doesn't it?
> >> > > > -- Lars
> >> > > >
> >> > > > ________________________________
> >> > > > From: Jean-Marc Spaggiari
> >> > > > To: dev@hbase.apache.org
> >> > > > Sent: Friday, March 1, 2013 7:12 PM
> >> > > > Subject: Re: [DISCUSS] More new feature backports to 0.94.
> >> > > >
> >> > > > @Lars: No, not any concern about anything already backported.
> >> > > > Just a preference to #2 because it seems to make things more
> >> > > > stable and easier to manage. New feature = new release. Given new
> >> > > > sub-releases are for fixes and improvements, but not new
> >> > > > features. Also, if we backport a feature in one or many previous
> >> > > > releases, we will also have to backport all the fixes each time
> >> > > > there is an issue. So we will have more maintenance work on
> >> > > > previous releases.
> >> > > >
> >> > > > 2013/3/1 Enis Söztutar :
> >> > > > > I think the current way of risk vs rewards analysis is working
> >> > > > > well. We will just continue doing that on a case by case basis,
> >> > > > > discussing the implications on individual issues.
> >> > > > >
> >> > > > > On Fri, Mar 1, 2013 at 6:46 PM, Lars Hofhansl <lhofhansl@yahoo.com> wrote:
> >> > > > >
> >> > > > >> BTW are you concerned about any specific back port we did in
> >> > > > >> the past? So far we have not seen any destabilization in any
> >> > > > >> of the 0.94 releases.
> >> > > > >>
> >> > > > >> Jean-Marc Spaggiari wrote:
> >> > > > >>
> >> > > > >> >Hi Lars, for #2, does it mean you will stop back-porting new
> >> > > > >> >features once it becomes a "long-term" release? If so, I'm
> >> > > > >> >for option #2...
> >> > > > >> >
> >> > > > >> >JM
> >> > > > >> >
> >> > > > >> >In your option
> >> > > > >> >2013/3/1 Enis Söztutar :
> >> > > > >> >> Thanks Lars, I think it is a good listing of the options we
> >> > > > >> >> have.
> >> > > > >> >>
> >> > > > >> >> I'll be +1 for #1 and #2, with #1 being a preference.
> >> > > > >> >>
> >> > > > >> >> Enis
> >> > > > >> >>
> >> > > > >> >> On Fri, Mar 1, 2013 at 6:10 PM, lars hofhansl <larsh@apache.org> wrote:
> >> > > > >> >>
> >> > > > >> >>> So it seems that until we have a stable 0.96 (maybe 0.96.1
> >> > > > >> >>> or 0.96.2) we have three options:
> >> > > > >> >>> 1. Backport new features to 0.94 as we see fit as long as
> >> > > > >> >>> we do not destabilize 0.94.
> >> > > > >> >>> 2. Declare a certain point release (0.94.6 looks like a
> >> > > > >> >>> good candidate) as a "long term" release, create an 0.94.6
> >> > > > >> >>> branch (in addition to the usual 0.94.6 tag) and then
> >> > > > >> >>> create 0.94.6.x fix-only releases. I would volunteer to
> >> > > > >> >>> maintain a 0.94.6 branch in addition to the 0.94 branch.
> >> > > > >> >>> 3. Categorically do not backport new features into 0.94
> >> > > > >> >>> and defer to 0.95.
> >> > > > >> >>>
> >> > > > >> >>> I'd be +1 on option #1 and #2, and -1 on option #3.
> >> > > > >> >>>
> >> > > > >> >>> -- Lars
> >> > > > >> >>>
> >> > > > >> >>> ________________________________
> >> > > > >> >>> From: Jonathan Hsieh
> >> > > > >> >>> To: dev@hbase.apache.org; lars hofhansl
> >> > > > >> >>> Sent: Friday, March 1, 2013 3:11 PM
> >> > > > >> >>> Subject: Re: [DISCUSS] More new feature backports to 0.94.
> >> > > > >> >>>
> >> > > > >> >>> I think we are basically agreeing -- my primary concern is
> >> > > > >> >>> that bringing new features into vital paths introduces more
> >> > > > >> >>> risk. I'd rather not backport major new features unless we
> >> > > > >> >>> achieve a higher level of assurance through system and
> >> > > > >> >>> basic fault injection testing.
> >> > > > >> >>>
> >> > > > >> >>> For the three current examples -- snapshots, zk table
> >> > > > >> >>> locks, online merge -- I actually would prefer not
> >> > > > >> >>> including any in apache 0.94. Of the bunch, I feel the
> >> > > > >> >>> table locks are the most risky since they affect vital
> >> > > > >> >>> paths a user must use, whereas snapshots and online merge
> >> > > > >> >>> are features that a user could choose to use but does not
> >> > > > >> >>> necessarily have to use. I'll voice my concerns, reason for
> >> > > > >> >>> concerns, and justifications on the individual jiras.
> >> > > > >> >>>
> >> > > > >> >>> I do feel that new features being in a dev/preview release
> >> > > > >> >>> like 0.95 aligns well and doesn't create situations where
> >> > > > >> >>> different versions have different feature sets. New
> >> > > > >> >>> features should be introduced and hardened in a dev/preview
> >> > > > >> >>> version, and then turn into the production-ready versions
> >> > > > >> >>> after they've been proven out a bit.
> >> > > > >> >>>
> >> > > > >> >>> Jon.
> >> > > > >> >>>
> >> > > > >> >>> On Fri, Mar 1, 2013 at 11:00 AM, lars hofhansl <larsh@apache.org> wrote:
> >> > > > >> >>>
> >> > > > >> >>> > This is an open source project; as long as there is a
> >> > > > >> >>> > volunteer to backport a patch I see no problem with
> >> > > > >> >>> > doing this.
> >> > > > >> >>> > The only thing we as the community should ensure is that
> >> > > > >> >>> > it must be demonstrated that the patch does not
> >> > > > >> >>> > destabilize the 0.94 code base; that has to be done on a
> >> > > > >> >>> > case by case basis.
> >> > > > >> >>> >
> >> > > > >> >>> > Also, there is no stable release of HBase other than
> >> > > > >> >>> > 0.94 (0.95 is not stable, and we specifically state that
> >> > > > >> >>> > it should not be used in production).
> >> > > > >> >>> >
> >> > > > >> >>> > -- Lars
> >> > > > >> >>> >
> >> > > > >> >>> > ________________________________
> >> > > > >> >>> > From: Jonathan Hsieh
> >> > > > >> >>> > To: dev@hbase.apache.org
> >> > > > >> >>> > Sent: Friday, March 1, 2013 8:31 AM
> >> > > > >> >>> > Subject: [DISCUSS] More new feature backports to 0.94.
> >> > > > >> >>> >
> >> > > > >> >>> > I was thinking more about HBASE-7360 (backport snapshots
> >> > > > >> >>> > to 0.94) and also saw HBASE-7965 which suggests porting
> >> > > > >> >>> > some major-ish features (table locks, online merge) into
> >> > > > >> >>> > the apache 0.94 line. We should chat about what we want
> >> > > > >> >>> > to do about new features and bringing them into stable
> >> > > > >> >>> > versions (0.94 today) and in general criteria we use for
> >> > > > >> >>> > future versions.
> >> > > > >> >>> >
> >> > > > >> >>> > This is similar to the snapshots backport discussion and
> >> > > > >> >>> > earlier backport discussions. Here's my understanding of
> >> > > > >> >>> > high level points we basically agree upon.
> >> > > > >> >>> > * Backporting new features to the previous major version
> >> > > > >> >>> > incurs more cost when developing new features, pushes
> >> > > > >> >>> > back efforts on making the trunk versions and reduces
> >> > > > >> >>> > incentive to move to newer versions.
> >> > > > >> >>> > * Backporting new features to earlier versions (0.9x.0,
> >> > > > >> >>> > 0.9x.1) is reasonable since they are generally less
> >> > > > >> >>> > stable.
> >> > > > >> >>> > * Backporting new features to later versions (0.9x.5,
> >> > > > >> >>> > 0.9x.6) is less reasonable -- (ex: a 0.94.6, or 0.94.7
> >> > > > >> >>> > should only include robust features).
> >> > > > >> >>> > * Backporting orthogonal features (snapshots) seems less
> >> > > > >> >>> > risky than core-changing features.
> >> > > > >> >>> > * An exception: If multiple distributions declare intent
> >> > > > >> >>> > to backport, it makes sense to backport a feature
> >> > > > >> >>> > (snapshots for example).
> >> > > > >> >>> >
> >> > > > >> >>> > Some new circumstances and discussion topics:
> >> > > > >> >>> > * We now have a dev branch (0.95) with looser compat
> >> > > > >> >>> > requirements that we could more readily release with
> >> > > > >> >>> > dev/preview versions. Shouldn't this reduce the need to
> >> > > > >> >>> > backport features to the apache stable branches? Would
> >> > > > >> >>> > these releases "replace" the 0.x.0 or 0.x.1 releases?
> >> > > > >> >>> > * For major features in later versions we should raise
> >> > > > >> >>> > the bar on the amount of testing and probably be more
> >> > > > >> >>> > explicit about what testing is done (unit tests not
> >> > > > >> >>> > sufficient, system testing stories/reports a
> >> > > > >> >>> > requirement).
> >> > > > >> >>> > Any other suggestions?
> >> > > > >> >>> >
> >> > > > >> >>> > Jon.
> >> > > > >> >>> >
> >> > > > >> >>> > --
> >> > > > >> >>> > // Jonathan Hsieh (shay)
> >> > > > >> >>> > // Software Engineer, Cloudera
> >> > > > >> >>> > // jon@cloudera.com
> >> > > > >> >>>
> >> > > > >> >>> --
> >> > > > >> >>> // Jonathan Hsieh (shay)
> >> > > > >> >>> // Software Engineer, Cloudera
> >> > > > >> >>> // jon@cloudera.com
> >>
> >> --
> >> // Jonathan Hsieh (shay)
> >> // Software Engineer, Cloudera
> >> // jon@cloudera.com
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com

--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com