Date: Fri, 28 May 2010 23:19:41 -0700
Subject: Re: Contributor Meeting Minutes 05/28/2010
From: Eli Collins
To: general@hadoop.apache.org
Cc: mapreduce-dev@hadoop.apache.org, hdfs-dev@hadoop.apache.org

The list stripped my slides.
Posted notes to the wiki, which doesn't seem to allow attachments so
not sure where to put slides.

http://wiki.apache.org/hadoop/HadoopContributorsMeeting20100528

On Fri, May 28, 2010 at 7:59 PM, Eli Collins wrote:
> Slides attached. Thanks for taking notes Chris!
>
>
> On Fri, May 28, 2010 at 5:37 PM, Chris Douglas wrote:
>> This month, the MapReduce + HDFS contributor meeting was held at
>> Cloudera Headquarters.
>>
>> Announcements for contributor meetings are here:
>> http://www.meetup.com/Hadoop-Contributors/
>>
>> Minutes follow. No decisions were made at this meeting, but the
>> following issues were discussed and may presage future discussion and
>> decisions on these lists.
>>
>> Eli, I think you have all the slides. Would you mind sending them out? -C
>>
>> == 0.21 release update ==
>> * Continuing to close blockers, ping people for updates and suggestions
>> * About 20 open blockers. Many are MapReduce documentation that may be
>> pushed. Speak up if 0.21 is missing anything substantive.
>> * Common/HDFS visibility and annotations are close to consensus;
>> MapReduce annotations are committed to trunk and the 0.21 branch
>>
>> == HEP proposal ==
>> (what follows is the sketch presented at the meeting. A full proposal
>> with concrete details will be circulated on the list)
>>
>> * Based on- and very similar to- the PEP (Python Enhancement Proposal) Process
>> * Audience is HDFS and MapReduce; not necessarily adopted by other subprojects
>>  - Addresses the perception that there is friction between
>> innovation/experimentation and stability
>> * Not for small enhancements, features, and bug fixes.
>> This should not slow down typical development or impede casual
>> contribution to Hadoop
>> * Primary mechanism for new features, collecting input, documenting
>> design decisions
>> * JIRA is good for details, but not for deciding on wide shifts in direction
>> * Purpose is for author to build consensus and gather dissenting opinions.
>>  - All may comment, but Editors will review incoming HEP material
>>  - Editors determine only whether the HEP is complete, not whether
>> they believe it is a sound idea
>>  - Editors are appointed by the PMC
>>  - Mechanism for appointing Editors and term of service TBD
>>    - Apache Board appoints Shepherds for projects somewhat randomly,
>> to projects. A similar mechanism could work for incoming HEPs
>>  - Proposal *may* come with code, but not necessarily.
>> Drafting/baking of the HEP occurs in public on a list dedicated to
>> that particular proposal. Once Editors certify the HEP as complete, it
>> is sent to general@ for wider discussion.
>>    - The discussion phase begins on general@. The mailing list exists
>> to ensure the HEP is complete enough to present to the community.
>>  - Some discussion on the difference between posting to general@ and
>> posting to the HEP list. Completeness is, of course, subjective. If
>> the Editor and Author disagree whether the proposal affects an aspect
>> of the framework enough to merit special consideration, it is not
>> entirely clear how to resolve the disagreement.
>>    - In general, the role of the Editor in the community-driven
>> process of Hadoop is not entirely clear. It may be possible to
>> optimize it out.
>>  - Once discussion ends, the HEP is passed (or fails to pass) by a
>> vote of the PMC (mechanics undefined). In Python, the result is
>> committed to the repository. A similar practice would make sense in
>> Hadoop.
>> * Which issues require HEPs?
>>  - Discussion ranged. Append, backup namenode, edit log rewrite, et al.
>> were examples of features substantial enough to merit a HEP. Pure
>> Java CRC is an example of an enhancement that would not. Whether an
>> explicit process must be in place to determine whether an issue
>> requires a HEP is not clear.
>>  - Viewing HEPs as a way of soliciting consensus for an approach
>> might be more accurate. Going through the HEP process should always
>> improve the chances of a successful proposal
>>
>> * Evaluation
>>  - The proposal may be rejected if it is redundant with existing
>> functionality, technically unsound, insufficiently motivated, no
>> backwards compatibility story, etc.
>>  - Implementation is not necessary, and is lightly discouraged.
>> Feedback is less welcome once code is in hand.
>>  - Purpose is to be clear about the acceptance criteria for that
>> issue, e.g. concerns that the proposal may not scale or may harm
>> performance
>>  - Dissenting opinions must be recorded accurately. Quoting would be
>> a safe practice for the Author to encourage HEP reviewers not to block
>> the product of the proposal.
>>
>> * The testing burden and completion strategy may be ambiguous
>>  - Whether the proposal affects scalability may not be testable by
>> the implementer. Completing the proposal to address all use cases may
>> require considerably more work than the Author is willing or motivated
>> to invest.
>>  - The HEP discussion on general@ should explore whether such
>> objections are merited and reasonable. For example, a particularly
>> obscure/esoteric use case could be included as a condition for
>> acceptance if the dissenter is willing to invest the resources to
>> test/validate it. The process is flexible in this regard.
>>    - But it is not infinitely flexible. Backwards compatibility,
>> performance regression, availability, and other considerations need
>> not be called out in every HEP.
>>    - Traditional concerns need to be documented.
>> Acceptance criteria should ideally be automated and reproducible in
>> different organizations
>>
>> == Branching ==
>> * A patch and a branch are isomorphic from a policy perspective. Of
>> course, they are functionally distinct: branches are easier to
>> collaborate on and are, generally, longer-lived than are patches. But
>> special policies need not be derived to account for these differences,
>> which concern the production of the code, not its review and
>> acceptance.
>> * Some developers find branches to be easier to review than very large
>> patches and easier to merge, given a toolchain that supports this.
>>  - Subversion currently is difficult to adapt to this model
>>  - Could be done on a HEP-by-HEP basis, as a condition for acceptance
>> * Eclipse Labs
>>  - Branded version of Google Code (same functionality, w/ Eclipse brand)
>>  - Not official Eclipse projects, but associated with Eclipse
>>  - Apache/Hadoop may consider a similar strategy
>>  - Distinct from Apache Labs, as one need not be a committer, follow
>> its rules for releases, etc.
>>
>> == Contrib ==
>> * Modules (such as fuse-dfs) are not actively maintained in the main
>> repository and would benefit from a release schedule decoupled from
>> the rest of Hadoop
>> * With few exceptions, the contrib modules have smaller, often
>> discrete groups of maintainers. It may be worth exploring whether
>> these projects could live elsewhere
>>
>