hadoop-general mailing list archives

From Eli Collins <...@cloudera.com>
Subject Re: [DISCUSSION] Proposal for making core Hadoop changes
Date Tue, 29 Jun 2010 03:03:55 GMT
Hey Konstantin,

Apologies for the delay, busy with stuff for the summit.  I'll send a
concrete proposal to general this week, based on our discussion at the
contributors' meeting.


On Mon, Jun 28, 2010 at 5:50 PM, Konstantin Shvachko <shv@yahoo-inc.com> wrote:
> Eli,
> Just checking on the status of this proposal.
> In the past I was hesitant about introducing more formalities.
> I now think we really need some mechanism for
> new feature and project proposals, also tracking decisions.
> For the reasons exactly as you describe in your email.
> Whether it is going to be HEP or something else, it is best
> if we adopt it soon.
> Thanks,
> --Konstantin
> On 5/21/2010 1:42 PM, Eli Collins wrote:
>> As HDFS and MapReduce have matured the cost and complexity of
>> introducing features has grown. Each new feature has to consider
>> interactions with a growing set of existing features, a growing user
>> base (upgrades, backwards compatibility) and additional use cases
>> (more and more projects now build on them). At the same time we don't
>> want the high bar for contribution to unnecessarily hinder new
>> development and releases.
>> Many projects at a similar stage address this by adopting a more
>> formal way to describe, socialize and shepherd enhancements to their
>> platforms. Today, new features are often discussed via an umbrella
>> jira, which may have an attached design document. There are a number
>> of issues with this approach. The design documents vary in format and
>> quality, and are often reviewed by a limited audience. They aren't
>> version controlled. Sometimes the proposal is only partially
>> specified. Jiras are often ignored. Understanding a proposal and its
>> implications through a series of threads in the jira comments is
>> difficult. It's hard for contributors and users to find these
>> top-level jiras and follow their status.
>> I'd like to propose that core Hadoop adopt something similar to
>> Python's PEP (Python Enhancement Proposal) [1]. A "HEP" would be a
>> single primary mechanism for proposing new features, incorporating
>> community feedback, and recording decisions. The author of the HEP
>> would be responsible for building consensus and moving the feature
>> forward. Similarly, some subset of the community would be responsible
>> for reviewing HEPs in a timely manner and identifying missing pieces
>> in the proposal. Discussion would occur before patches showed up on
>> jira. People interested in the core Hadoop roadmap could keep an eye
>> on the HEPs without the overhead of following jira traffic.
>> Why base this on the PEP? The format has proven useful to a
>> substantial existing project, and I think the workflow is not too
>> heavy-weight, and well-suited to a community such as ours. That being
>> said, we could discuss other models (e.g. Java's JSR).
>> Before we get into specifics, is this something the community would
>> like to adopt in some form? Does adapting the PEP and its workflow to
>> our projects, community and bylaws seem reasonable?
>> Thanks,
>> Eli
>> 1. http://www.python.org/dev/peps/pep-0001
