From: Todd Lipcon
Date: Wed, 5 Sep 2012 16:48:22 -0700
Subject: Re: Thoughts about large feature dev branches
To: dev@hbase.apache.org

Hope to have time to write up some more thoughts later, but some
interesting reading is this document from Linux on how to contribute
to that project:
https://github.com/mirrors/linux-2.6/blob/master/Documentation/SubmittingPatches

Worth looking at other projects' guidelines to form our own if we're
thinking of going this route.

-Todd

On Wed, Sep 5, 2012 at 4:43 PM, Jesse Yates wrote:
> On Wed, Sep 5, 2012 at 3:58 PM, Elliott Clark wrote:
>
>> +1 on git, either on github or closer to the linux model with real
>> distributed repos.
>>
>> - I've been using it for just about all of my development and it works
>> pretty nicely. I push everything to github as I'm working. Then I
>> squash commits and create a diff to post on jira.
>>
>
> I do the same, just locally. Solid model.
>
>
>> - I would suggest that since hbase's code base moves so rapidly, a
>> rebased branch should probably be a requirement before merging.
>> Otherwise the merge will get pretty interesting for very long lived
>> branches.
>>
>
> IIRC when Todd was working on some large stuff for HDFS he was doing this
> in a feature branch every few days. Seriously helps with when things are
> actually finished in terms of rolling it back in.
>
> Using github to keep a constantly rebased version (every few days) would be
> a reasonable, super-low friction way of solving the problem for
> non-committers. Further, for big changes, it would ensure that if the
> people go away we aren't left with a bunch of dangling branches in the svn.
> Problem here is also establishing the 'master' branch in github, though
> that can be established on a case-by-case basis with the people involved.
>
>>
>> On Wed, Sep 5, 2012 at 11:38 AM, Jonathan Hsieh wrote:
>> > This has been brought up in the past but we are here again.
>> >
>> > We have a few large features that are hanging out and having a hard time
>> > because trunk changes underneath them and in some cases because they are
>> > being worked by folks without a commit bit. (ex: snapshots w/ Jesse and
>> > Matteo, and have some other potentially in the pipeline -- major assignment
>> >
> I'm generally opposed to doing feature branches for a variety of reasons
> (left behind functionality, hard to roll back in, difficulty of testing,
> etc) and further don't feel it's really necessary for the snapshot
> code given that the code doesn't touch all that much of the current
> codebase.
>
> A lot of the pain with it right now is that the code has been broken into 5
> patches, making it hard to build a version of HBase that has snapshots 'in
> its current form'. This gets even worse as I'm planning on doing a bit more
> refactoring into a couple more patches to help make it more digestible
> (e.g. see latest patch for 3PC https://reviews.apache.org/r/6592/ which
> pulls out a lot of the coordination functionality). This helps with
> reviews, etc, but makes it a bit of a pain for people who want to do
> advanced testing on the feature - hard to justify doing a lot of that work
> though as if the code is changing a lot, then testing doesn't make much
> sense.
>
> In terms of how the work is breaking down, with Matteo doing restore on top
> of the taking that I'm working on, his part clearly depends on the taking
> of snapshots. However, the filesystem layout hasn't changed at all in
> nearly the last two months, meaning the work can proceed pretty much
> independently (more or less).
>
>
>> > manager changes with Jimmy and possibly me,
>> >
> This is a lot more high-touch with the codebase, making a branch (either in
> sandbox or otherwise) more feasible.
>
>
>> > HBASE-4120, HBASE-2600,
>> > removing root)
>> >
> Salesforce is planning on tackling at least the latter two in the next few
> months, so this is something that we need to figure out :)
>
>
>> >
>> > Though I wasn't around yet, it seems like this is what we did for
>> > coprocs/security, probably for the 0.90 master.
>> >
>> > http://search-hadoop.com/m/byzZYZMktx1/hbase+windows&subj=Re+Proposed+feature+branch+for+HBase+security
>> >
>> > Were the folks working on those features committers at the time? What do
>> > we do for contributions from folks who aren't committers yet?
>> >
>> > This was proposed over on hadoop-general by Todd -- what do you all think
>> > about doing something like this for the major changes? (Github seems
>> > easiest, svn seems "more official").
>> >
>> > Here's one proposal, making use of git as an easy way to allow
>> > non-committers to "commit" code while still tracking development in
>> > the usual places:
>> > - Upon anyone's request, we create a new "Version" tag in JIRA.
>> > - The developers create an umbrella JIRA for the project, and file the
>> > individual work items as subtasks (either up front, or as they are
>> > developed if using a more iterative model)
>> > - On the umbrella, they add a pointer to a git branch to be used as
>> > the staging area for the branch. As they develop each subtask, they
>> > can use the JIRA to discuss the development like they would with a
>> > normally committed JIRA, but when they feel it is ready to go (not
>> > requiring a +1 from any committer) they commit to their git branch
>> > instead of the SVN repo.
>> > - When the branch is ready to merge, they can call a merge vote, which
>> > requires +1 from 3 committers, same as a branch being proposed by an
>> > existing committer.
>> > A committer would then use git-svn to merge their
>> > branch commit-by-commit, or if it is less extensive, simply generate a
>> > single big patch to commit into SVN.
>> >
> Overall, this seems reasonable. I can imagine the work to merge back in
> being a huge pain. It would be great to see if we can break down these big
> changes into smaller patches and roll them in one at a time. Both in terms
> of ease on a single committer and helping to ensure code quality of each
> sub-piece; it's easier to enforce good testing on smaller pieces and helps
> with code reuse.
>
> My comments above obviously contradict this a little bit - it's a huge pain
> to work on the end functionality when the sub-pieces that you are building
> on shift due to code reviews. In the end it leads to a better foundation,
> but can be a headache to keep everything in sync.
>
> The latter goes away a bit if we have a single branch with the majority of
> the code then progressive commits to fix things, but still is terrible to
> review (pot calling the kettle black here) that first massive code drop.
>
> TL;DR prefer smaller, independently useful patches that build to the bigger
> change. It may not be possible for some features, but should make it
> easier to review, roll in, and in the end merge the final change while
> being more generally useful.
>
>
>>
>> > Another alternative, if people are reluctant to use git, would be to
>> > add a "sandbox/" repository inside our SVN, and hand out commit bit to
>> > branches inside there without any PMC vote. Anyone interested in
>> > contributing could request a branch in the sandbox, and be granted
>> > access as soon as they get an apache SVN account.
>> >
>> >
> This seems a little excessive. It would be nice for the more 'official'
> status this confers, but seems to create more friction than it's worth
> (IMO).
>
>
> TL;DR github with 'official' branches per umbrella JIRA seems a
> low-friction way to do feature branches without the possibility of cruft in
> the main repository. We should really be sure that we need a branch, though,
> and still favor smaller patches along the same branch for generally
> useful features.
>
> -------------------
> Jesse Yates
> @jesse_yates
> jyates.github.com

--
Todd Lipcon
Software Engineer, Cloudera