hadoop-general mailing list archives

From Eli Collins <...@cloudera.com>
Subject Re: [DISCUSSION] Combine committer lists in Common/HDFS/MapReduce
Date Tue, 10 Aug 2010 17:24:53 GMT
On Mon, Aug 9, 2010 at 9:26 AM, Doug Cutting <cutting@apache.org> wrote:
> On 08/08/2010 12:21 PM, Arun C Murthy wrote:
>>
>> This of course begs a larger question - should we just merge Common,
>> HDFS & Map-Reduce together and be done with?
>
> I think there's still a reasonable long-term goal to split MapReduce from
> HDFS, so that they can release separately and are maintained by separate
> teams.  So I believe a strong division of these code trees and release
> artifacts should remain.
>
> I'd like to get rid of Common.  It could either be merged into HDFS or
> gradually whittled away to nothing.  I'd prefer the latter.  If we move to
> different RPC and serialization systems (e.g., Avro) then Common's io and
> ipc packages might be removed.  Configuration might be replaced/merged with
> Jakarta Commons Configuration (http://commons.apache.org/configuration/).
>  Similarly, the metrics and fs packages might be moved to Jakarta Commons.
>  Such changes might be hard to do back-compatibly, however.

Merging the o.a.h.fs back into the hdfs repo would be helpful.  It's a
pain to develop a file system with client and server split into
multiple repositories, and the other fs implementations probably do
not want their own repository since they need to get updated when the
clients change as well.

Developing and releasing hadoop post-project split has been a pain.
Going back to two repos (merging common and hdfs) or a single repo
would make life easier for the people developing and releasing.  As you
point out, users want to consume hadoop as a single project and not
worry about common, mr and hdfs as separately released and versioned
components, so I'm not sure which community the split is serving.

Thanks,
Eli



>
> I don't see that merging the Jira databases or mailing lists for HDFS and
> MapReduce offers big advantages.  The redundant, coordinated Jiras tend to
> be between Common and the others, no?
>
> Doug
>
