Date: Sat, 31 Oct 2015 10:58:48 -0700
Subject: Re: Github integration for Hadoop
From: "Colin P. McCabe"
To: Hadoop Common

Thanks for your responses here. It sounds like the proposal here is for doing code reviews on GH, but still doing commits in our existing way.
Since it wasn't spelled out in the initial proposal, I interpreted it as doing both reviews and commits on GH, like Spark does-- which I think is problematic for all the reasons we've discussed here (the fact that GH introduces merge commits, the possibility of bypassing jira, duplicate pull requests with no search features to dedup them, etc.).

Nobody has really come up with a solution for the problems caused by __committing__ through GH that scales to our size of community. If there is a general consensus that __code reviews__ through GH would be helpful, I will change my -1 to a +0 for that. But let's make sure that we are not __committing__ through GH. I view this as kind of an experiment to see how much easier things are this way, so I will try to keep an open mind.

In parallel with this experiment, I also think we should set up a gerrit instance that supports code reviews and precommit testing. As I said, Cloudera uses gerrit internally and we are very happy with it. It is nicer than GH because we can set up our own precommit hooks. For example, we can reject gerrit change requests that don't have a jira number associated with them. Gerrit change requests can be created entirely from the command line as well. Gerrit is open source, and doesn't create merge commits for everything if you commit through it.

I think we can support multiple solutions in parallel and let people gravitate to the most convenient one, as long as we keep our project history accessible on JIRA and the mailing lists. Also, as Andrew commented, let's make sure we are not setting up duplicate bug trackers or mailing lists on GH-- one of each of those is enough :)

Colin

On Sat, Oct 31, 2015 at 4:40 AM, Steve Loughran wrote:
>
>> On 30 Oct 2015, at 17:15, Colin P. McCabe wrote:
>>
>> I think the Spark guys eventually built some kind of UI on top of
>> github to help them search through pull requests. We would probably
>> also need something like this.
>
> https://spark-prs.appspot.com/users
>
> They do have to impose a naming scheme on those patches to help identify the area. You can just watch a JIRA and wait for a pull-req to arrive.
>
>>
>> Spark uses github partially because it started as a github project, so
>> everyone was familiar with that. I haven't seen an answer to Andrew's
>> question about what the value add is here for Hadoop to move to a new
>> system. I have seen a few comments about a better review UI and
>> one-click patch submission, is that the main goal?
>
> I do think it is good for a very fast cycle time on reviews, though that depends, of course, on reviewers willing to put in the time (credit to your colleagues here, Colin).