hadoop-general mailing list archives

From: "Roy T. Fielding" <field...@gbiv.com>
Subject: Re: [VOTE] Direction for Hadoop development
Date: Tue, 07 Dec 2010 22:37:42 GMT

On Dec 7, 2010, at 9:18 AM, Doug Cutting wrote:

> On 12/06/2010 05:09 PM, Roy T. Fielding wrote:
>> Generally speaking, vetoing extension interfaces without a compelling
>> technical reason is not the way Apache operates.  We make extensions
>> modular so that diverse collaborators can specialize according to
>> their own needs, not just your needs.
> 
> Roy, thanks for your thoughts here.
> 
> I have not intended to veto an extension mechanism.  In this case, we
> already have an extension mechanism.

It is my understanding that the existing extension mechanism did not
support the desired extensions, so improving it makes sense.

>  The proposal is to change the extension mechanism incompatibly with unclear benefits,

Good, these are technical reasons.  The benefits can be clarified by docs.
By incompatible, I assume you mean forward-compatibility of old versions
of Hadoop reading newer files.  Can we fix that by having the new
implementation use the old file format by default until it is configured
to use one of the new interfaces for writing?

> add implementations of several extensions to the kernel, and incompatibly
> change a widely-used file format.

You keep referring to the kernel as if it were a product.  I don't see
a kernel product in the list of things released by Apache Hadoop.
If there were such a product, then it would make sense for Apache Hadoop
to also release ancillary products for common libraries, test frameworks,
and modular storage interfaces.  Rearchitecting the Hadoop product suite
into such a logical arrangement would make sense, and after such an
architecture is put into place then "keeping the kernel simple" would
be a reason to veto a change to the kernel.

> In general I support improving extension mechanisms, but oppose gratuitous
> changes to file formats and the inclusion of new user-level functionality
> in the kernel.

Persistence is not usually considered user-level functionality, nor do
the proposed changes seem gratuitous.  Owen said the reason was to
support type-safety, which may well be a desirable feature for some
users.  I think it makes sense to find a way to modularize the feature
such that this functionality is only brought in when configured
by the user.

>  I'd like the issue to focus solely on the extension mechanism to clarify
> the discussion, not on adding extensions to the kernel or file formats.

That is irrelevant.  Doing development via jira discussion is inherently
dysfunctional because it promotes such bureaucratic nonsense instead of
working towards a common solution via iterative development.  My goal
here is to fix this goofy behavior, not reinforce it.

> Tom long ago provided patches showing how the existing configuration
> system can provide equivalent extension implementations outside of the
> kernel with no incompatible changes.  (MAPREDUCE-376 and MAPREDUCE-377)

They both seem to be active and unfinished.  If they are equivalent fixes
to the same problem, then I suggest applying them to a branch, documenting
how they work, and then agreeing to have a bake-off.  A bake-off is a
decision made by performance and feature-completeness as an objective
way to resolve an impasse due to mutually exclusive vetoes.  All sides agree
to drop the veto and accept whichever performs best, by majority decision.

>> Changes to the existing products,
>> including plans like the one Owen described, are subject to vote if anyone
>> disagrees with them.
> 
> Is this described somewhere?  The HTTPD page says, "Long term plans are
> simply announcements that group members are working on particular issues
> related to the Apache software. These are not voted on [...]."

All action items can be voted on.  What we are talking about here is a
short term plan, and it is listed as a type of action item under
changes to products.

>> They are also subject to veto if and only if they
>> are to be applied to the current release branch (or a released branch).
> 
> Owen intends to merge this patch to a release branch.

Right.

>> The compelling reason would be a measured performance impact or some
>> other objective degradation of the existing product that can be
>> evaluated by others as a cost/benefit tradeoff and perhaps compensated
>> by modifying the implementation.
> 
> Files written by the proposed new version would not be readable by older
> versions of Hadoop.  An unaltered application that upgrades to the newer
> version would begin creating files that could not be interchanged with
> folks running the older version.

Good reason, so let's fix that.  Note, however, that one valid solution is
to simply release it as a new major version of the product.

>> If a PMC member insists on making design opinion the sole basis of their
>> vetoes, then they are not collaborating with the rest of the PMC.  The
>> board will recommend that such a person be removed from the PMC so that
>> the majority can continue to develop the product in peace.
> 
> I am not the sole PMC member to express these opinions.

No, but the other objections seem to be suffering from a lack of independent
thought and are predicated on the theory that outside organizations will
satisfy the needs of our users instead of an Apache project solving them
directly.  That is extremely annoying to me, since the only reason I am here
is to deal with some folks' failure to think independently of their employer.

....Roy
