hadoop-mapreduce-dev mailing list archives

From Eric Yang <eric...@gmail.com>
Subject Re: Hadoop Tools Layout (was Re: DistCpV2 in 0.23)
Date Wed, 07 Sep 2011 17:50:27 GMT
MapReduce and HDFS are distinct functions of Hadoop, and they are
loosely coupled.  A tools aggregator module would not have as clearly
defined a function as the other Hadoop modules, so a tool in it could
depend on both HDFS and MapReduce.  If something breaks in the tools
module, it is unclear which subproject is responsible for maintaining
it.  Therefore, it is safer to send tools to the Incubator or Apache
Extras rather than deposit the utility tools in a tools subcategory.
There are many short-lived projects that attempt to associate
themselves with Hadoop but are not maintained.  It would be better to
spin off those utility projects than to use Hadoop as a dumping ground.

In the previous discussion about removing contrib, most people were in
favor of doing so, and only a few contrib owners were reluctant to
remove it.  Even fewer people have participated in restoring the
functionality of broken contrib projects.  History speaks for itself.
-1 (non-binding) for hadoop-tools.

regards,
Eric

On Tue, Sep 6, 2011 at 6:55 PM, Alejandro Abdelnur <tucu@cloudera.com> wrote:
> Eric,
>
> Personally I'm fine either way.
>
> Still, I fail to see why generic versus categorized tools would increase
> or reduce the risk of dead code, or how they would make packaging and
> deployment harder or easier.
>
> Would you please explain this?
>
> Thanks.
>
> Alejandro
>
> On Tue, Sep 6, 2011 at 6:38 PM, Eric Yang <eric818@gmail.com> wrote:
>
>> Option #2, proposed by Amareshwari, seems like the better proposal.  We don't
>> want to repeat the history of contrib with hadoop-tools.  Having a
>> generic module like hadoop-tools increases the risk of accumulating dead code.
>> It would be better to categorize the hdfs- or mapreduce-specific tools in
>> their respective subcategories.  It is also easier to manage from a
>> package/deployment perspective.
>>
>> regards,
>> Eric
>>
>> On Sep 6, 2011, at 4:32 PM, Eli Collins wrote:
>>
>> > On Tue, Sep 6, 2011 at 10:11 AM, Allen Wittenauer <aw@apache.org> wrote:
>> >>
>> >> On Sep 6, 2011, at 9:30 AM, Vinod Kumar Vavilapalli wrote:
>> >>> We still need to answer Amareshwari's question (2) she asked some time back
>> >>> about the automated code compilation and test execution of the tools module.
>> >>
>> >>
>> >>
>> >>>>> My #1 question is if tools is basically contrib reborn.  If not, what makes
>> >>>>> it different?
>> >>
>> >>
>> >>        I'm still waiting for this answer as well.
>> >>
>> >>        Until such, I would be pretty much against a tools module.
>> >>  Changing the name of the dumping ground doesn't make it any less of a
>> >> dumping ground.
>> >
>> > IMO if the tools module only gets stuff like distcp that's maintained
>> > then it's not contrib, if it contains all the stuff from the current
>> > MR contrib then tools is just a re-labeling of contrib. Given that
>> > this proposal only covers moving distcp to tools it doesn't sound like
>> > contrib to me.
>> >
>> > Thanks,
>> > Eli
>>
>>
>
