Delivered-To: mailing list hdfs-dev@hadoop.apache.org
From: Andrew Wang
Date: Fri, 10 Jun 2016 17:11:00 -0700
Subject: Re: [DISCUSS] Increased use of
feature branches
To: Sangjin Lee
Cc: Anu Engineer, Zhe Zhang, Junping Du, Karthik Kambatla, common-dev@hadoop.apache.org, hdfs-dev@hadoop.apache.org, mapreduce-dev@hadoop.apache.org, yarn-dev@hadoop.apache.org
archived-at: Sat, 11 Jun 2016 00:11:32 -0000

Let me try to clarify a few points, since not everyone might have been
present for the previous emails.

On the "Looking to a Hadoop 3 release" thread, we already reached consensus
on doing releases from trunk. People didn't want to have to commit to
another branch, and wanted to try releasing from trunk.

The question, then, was how to ensure that trunk remains stable and
releasable. Part of Vinod's proposal was that we, as a community, be more
judicious about what we commit to trunk, and try to make use of more
feature branches for larger efforts.

There was no requirement that 1-2 patch changes go through a feature
branch. There weren't any requirements around # of patches or length of
development at all, just asking that committers be more judicious. I
personally think Sangjin's rule of thumb of ~12 patches or ~1 month is
about right, but it's up to the developers who are involved, and I doubt
any one standard will fit all situations.

So, this is about as low-overhead a policy as there is: devs, please be
careful when committing to trunk, and consider using a feature branch for
bigger efforts.

If you have further ideas about how to improve stability of trunk, I'd love
to hear them. I'd hope though that the above would be a non-controversial
statement.

Best,
Andrew

On Fri, Jun 10, 2016 at 2:10 PM, Sangjin Lee wrote:

> Thanks for your thoughts Anu.
>
> Regarding your question
>
>> And then comes the question, once 3.0 becomes official, where do we
>> check-in a change, if that would break something?
so this will lead us
>> back to trunk being the unstable – 3.0 being the new “branch-2”.
>
>
> Andrew mentioned in the original email
>
>> Regarding "trunk-incompat", since we're still in the alpha stage for
>> 3.0.0, there's no need for this branch yet. This aspect of Vinod's proposal
>> was still under a bit of discussion; Chris Douglas thought we should cut a
>> branch-3 for the first 3.0.0 beta, which aligns with my original thinking.
>> This point doesn't necessarily need to be resolved now though, since again
>> we're still doing alphas.
>
>
> and I agree with that sentiment. I think even if we have a
> "trunk-incompat" branch to hold future incompatible changes, the situation
> will change little from today. Instead of dealing with "trunk" (where
> incompatible changes may appear) and "branch-3", we would be dealing with
> "trunk-incompat" and "trunk". Names are largely mnemonics then.
>
>
> On Fri, Jun 10, 2016 at 12:37 PM, Anu Engineer
> wrote:
>
>> I actively work on two branches (Diskbalancer and ozone) and I agree with
>> most of what Sangjin said.
>> There is an overhead in working with branches, there are both technical
>> costs and administrative issues
>> which discourage developers from using branches.
>>
>> I think the biggest issue with branch based development is the fact that
>> other developers do not use a branch.
>> If a small feature appears as a series of commits to “datanode.java”,
>> the branch based developer ends up rebasing
>> and paying this price of rebasing many times. If everyone followed a
>> model of branch + Pull request, other branches
>> would not have to deal with continuous rebasing to trunk commits. If we
>> are moving to a branch based
>> development, we should probably move to that model for most development
>> to avoid this tax on people who
>> actually end up working in the branches.
>>
>> I do have a question in my mind though: What is being proposed is that we
>> move active development to branches
>> if the feature is small or incomplete, however keep the trunk open for
>> check-ins. One of the biggest reasons why we
>> check-in into trunk and not to branch-2 is because it is a change that
>> will break backward compatibility. So do we
>> have an expectation of backward compatibility thru the 3.0-alpha series
>> (I personally vote No, since 3.0 is experimental
>> at this stage), but if we decide to support some sort of backward-compat
>> then willy-nilly committing to trunk
>> and still maintaining the expectation we can release Alphas from 3.0 does
>> not look possible.
>>
>> And then comes the question, once 3.0 becomes official, where do we
>> check-in a change, if that would break something?
>> so this will lead us back to trunk being the unstable – 3.0 being the new
>> “branch-2”.
>>
>> One more point: If we are moving to use a branch always – then we are
>> looking at a model similar to using a git + pull
>> request model. If that is so would it make sense to modify the rules to
>> make these branches easier to merge?
>> Say for example, if all commits in a branch have followed review and
>> checking policy – just like trunk and commits
>> have been made only after a sign off from a committer, would it be
>> possible to merge with a 3-day voting period
>> instead of 7, or treat it just like today’s commit to trunk – but with 2
>> people signing-off?
>>
>> What I am suggesting is reducing the administrative overheads of using a
>> branch to encourage use of branching.
>> Right now it feels like Apache’s process encourages committing directly
>> to trunk rather than a branch
>>
>> Thanks
>> Anu
>>
>>
>> On 6/10/16, 10:50 AM, "sjlee0@gmail.com on behalf of Sangjin Lee" <
>> sjlee0@gmail.com on behalf of sjlee@apache.org> wrote:
>>
>> >Having worked on a major feature in a feature branch, I have some thoughts
>> >and observations on feature branch development.
>> >
>> >IMO feature branch development v. direct commits to trunk in piecemeal is
>> >really a choice of *granularity*. Do we want a series of fine-grained state
>> >changes on trunk or fewer coarse-grained chunks of commits on trunk?
>> >
>> >This makes me favor a branch-based development model for any "decent-sized"
>> >features (we'll need to define "decent-sized" of course). Once you have
>> >coarse-grained changes, it's easier to reason about what made what release
>> >and in what state. As importantly, it makes it easier to back out a
>> >complete feature fairly easily if that becomes necessary. My totally
>> >unscientific suggestion may be if a feature takes more than a dozen commits
>> >and longer than a month, we should probably have a bias towards a feature
>> >branch.
>> >
>> >Branch-based development also makes you go faster if your feature is
>> >larger. I wouldn't do it the other way for timeline service v.2 for example.
>> >
>> >That said, feature branches don't come for free. Now the onus is on the
>> >feature developer to constantly rebase with the trunk to keep it reasonably
>> >integrated with the trunk. More logistics is involved for the feature
>> >developer. Another big question is, when a feature branch gets big and it's
>> >time to merge, would it get as scrutinized as a series of individual
>> >commits? Since the size of merge can be big, you kind of have to rely on
>> >those feature committers and those who help them.
>> >
>> >In terms of integrating/stabilizing, I don't think branch development
>> >necessarily makes it harder. It is again granularity. In case of direct
>> >commits on trunk, you do a lot more fine-grained integrations. In case of
>> >branch development, you do far fewer coarse-grained integrations via
>> >rebasing. If more people are doing branch-based development, it makes
>> >rebasing easier to manage too.
>> >
>> >Going back to the related topic of where to release (trunk v. branch-X), I
>> >think that is more of a proxy of the real question of "how do we maintain
>> >quality and stability of the trunk?". Even if we release from the trunk, if
>> >our bar for merging to trunk is low, the quality will not improve
>> >automatically. So I think we ought to tackle the quality question first.
>> >
>> >My 2 cents.
>> >
>> >
>> >On Fri, Jun 10, 2016 at 8:57 AM, Zhe Zhang wrote:
>> >
>> >> Thanks for the notes Andrew, Junping, Karthik.
>> >>
>> >> Here are some of my understandings:
>> >>
>> >> - Trunk is the "latest and greatest" of Hadoop. If a user starts using
>> >> Hadoop today, without legacy workloads, trunk is what he/she should use.
>> >> - Therefore, each commit to trunk should be transactional -- atomic,
>> >> consistent, isolated (from other uncommitted patches); I'm not so sure
>> >> about durability, Hadoop might be gone in 50 years :). As a committer, I
>> >> should be able to look at a patch and determine whether it's a
>> >> self-contained improvement of trunk, without looking at other uncommitted
>> >> patches.
>> >> - Some comments inline:
>> >>
>> >> On Fri, Jun 10, 2016 at 6:56 AM Junping Du wrote:
>> >>
>> >> > Comparing with advantages, I believe the disadvantages of shipping any
>> >> > releases directly from trunk are more obvious and significant:
>> >> > - A lot of commits (incompatible, risky, uncompleted feature, etc.)
>> >> > have to wait to commit to trunk or put into a separated branch that could delay
>> >> > feature development progress as additional vote process get involved even
>> >> > if the feature is simple and harmless.
>> >> >
>> >> Thanks Junping, those are valid concerns. I think we should clearly
>> >> separate incompatible from uncompleted / half-done work in this
>> >> discussion. Whether people should commit incompatible changes to trunk is a
>> >> much more tricky question (related to trunk-incompat etc.). But per my
>> >> comment above, IMHO, *not committing uncompleted work to trunk* should be a
>> >> much easier principle to agree upon.
>> >>
>> >>
>> >> > - For small feature with only 1 or 2 commits, that need three +1 from PMCs
>> >> > will increase the bar largely for contributors who just start to contribute
>> >> > on Hadoop features but no such sufficient support.
>> >> >
>> >> Development overhead is another valid concern. I think our rule-of-thumb
>> >> should be that, small-medium new features should be proposed as a single
>> >> JIRA/patch (as we recently did for HADOOP-12666). If the complexity goes
>> >> beyond a single JIRA/patch, use a feature branch.
>> >>
>> >>
>> >> >
>> >> > Given these concerns, I am open to other options, like: proposed by Vinod
>> >> > or Chris, but rather than to release anything directly from trunk.
>> >> >
>> >> > - This point doesn't necessarily need to be resolved now though, since
>> >> > again we're still doing alphas.
>> >> > No. I think we have to settle down this first. Without a common agreed and
>> >> > transparent release process and branches in community, any release (alpha,
>> >> > beta) bits is only called a private release but not an official apache
>> >> > hadoop release (even alpha).
>> >> >
>> >> >
>> >> > Thanks,
>> >> >
>> >> > Junping
>> >> > ________________________________________
>> >> > From: Karthik Kambatla
>> >> > Sent: Friday, June 10, 2016 7:49 AM
>> >> > To: Andrew Wang
>> >> > Cc: common-dev@hadoop.apache.org; hdfs-dev@hadoop.apache.org;
>> >> > mapreduce-dev@hadoop.apache.org; yarn-dev@hadoop.apache.org
>> >> > Subject: Re: [DISCUSS] Increased use of feature branches
>> >> >
>> >> > Thanks for restarting this thread Andrew. I really hope we can get this
>> >> > across to a VOTE so it is clear.
>> >> >
>> >> > I see a few advantages shipping from trunk:
>> >> >
>> >> > - The lack of need for one additional backport each time.
>> >> > - Feature rot in trunk
>> >> >
>> >> > Instead of creating branch-3, I recommend creating branch-3.x so we can
>> >> > continue doing 3.x releases off branch-3 even after we move trunk to 4.x (I
>> >> > said it :))
>> >> >
>> >> > On Thu, Jun 9, 2016 at 11:12 PM, Andrew Wang <andrew.wang@cloudera.com>
>> >> > wrote:
>> >> >
>> >> > > Hi all,
>> >> > >
>> >> > > On a separate thread, a question was raised about 3.x branching and use of
>> >> > > feature branches going forward.
>> >> > >
>> >> > > We discussed this previously on the "Looking to a Hadoop 3 release" thread
>> >> > > that has spanned the years, with Vinod making this proposal (building on
>> >> > > ideas from others who also commented in the email thread):
>> >> > >
>> >> > > http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201604.mbox/browser
>> >> > >
>> >> > > Pasting here for ease:
>> >> > >
>> >> > > On an unrelated note, offline I was pitching to a bunch of
>> >> > > contributors another idea to deal
>> >> > > with rotting trunk post 3.x: *Make 3.x releases off of trunk directly*.
>> >> > >
>> >> > > What this gains us is that
>> >> > > - Trunk is always nearly stable or nearly ready for releases
>> >> > > - We no longer have some code lying around in some branch (today’s
>> >> > > trunk) that is not releasable
>> >> > > because it gets mixed with other undesirable and incompatible changes.
>> >> > > - This needs to be coupled with more discipline on individual
>> >> > > features - medium to large
>> >> > > features are always worked upon in branches and get merged into trunk
>> >> > > (and a nearing release!)
>> >> > > when they are ready
>> >> > > - All incompatible changes go into some sort of a trunk-incompat
>> >> > > branch and stay there till
>> >> > > we accumulate enough of those to warrant another major release.
>> >> > >
>> >> > > Regarding "trunk-incompat", since we're still in the alpha stage for 3.0.0,
>> >> > > there's no need for this branch yet. This aspect of Vinod's proposal was
>> >> > > still under a bit of discussion; Chris Douglas thought we should cut a
>> >> > > branch-3 for the first 3.0.0 beta, which aligns with my original thinking.
>> >> > > This point doesn't necessarily need to be resolved now though, since again
>> >> > > we're still doing alphas.
>> >> > >
>> >> > > What we should get consensus on is the goal of keeping trunk stable, and
>> >> > > achieving that by doing more development on feature branches and being
>> >> > > judicious about merges. My sense from the Hadoop 3 email thread (and the
>> >> > > more recent one on the async API) is that people are generally in favor of
>> >> > > this.
>> >> > >
>> >> > > We're just about ready to do the first 3.0.0 alpha, so would greatly
>> >> > > appreciate everyone's timely response in this matter.
>> >> > >
>> >> > > Thanks,
>> >> > > Andrew