hadoop-mapreduce-dev mailing list archives

From Alejandro Abdelnur <t...@cloudera.com>
Subject Re: Hadoop Tools Layout (was Re: DistCpV2 in 0.23)
Date Wed, 07 Sep 2011 18:18:28 GMT
Agreed, we should not have a dumping ground. IMO, what would go into
hadoop-tools (i.e. distcp, streaming, and someone could argue for FsShell as
well) are effectively Hadoop CLI utilities. Having them in a separate module
rather than in the core modules (common, hdfs, mapreduce) does not mean
that they are secondary things; it is just modularization. It will also help
get those tools to use the public interfaces of the core modules, and when we
finally have a clean hadoop-client layer, those tools should depend only on
that.

Finally, the fact that the tools would end up under trunk/hadoop-tools does
not prevent the HDFS and MAPREDUCE packaging from bundling the
same or different tools.

+1 for hadoop-tools/ (not binding)

Thanks.


On Wed, Sep 7, 2011 at 10:50 AM, Eric Yang <eric818@gmail.com> wrote:

> Mapreduce and HDFS are distinct functions of Hadoop.  They are loosely
> coupled.  If we have a tools aggregator module, it will not have as
> clear and distinct a function as the other Hadoop modules.  Hence, it is
> possible for a tool to depend on both HDFS and map reduce.  If
> something breaks in the tools module, it is unclear which subproject is
> responsible for maintaining the tools.  Therefore, it is safer to
> send tools to incubator or apache extra rather than deposit the
> utility tools in a tools subcategory.  There are many short-lived
> projects that attempt to associate themselves with Hadoop but are not
> being maintained.  It would be better to spin off those utility
> projects than to use Hadoop as a dumping ground.
>
> In the previous discussion about removing contrib, most people were in
> favor of doing so, and only a few contrib owners were reluctant to
> remove contrib.  Fewer people have participated in restoring the
> functionality of broken contrib projects.  History speaks for itself.
> -1 (non-binding) for hadoop-tools.
>
> regards,
> Eric
>
> On Tue, Sep 6, 2011 at 6:55 PM, Alejandro Abdelnur <tucu@cloudera.com>
> wrote:
> > Eric,
> >
> > Personally I'm fine either way.
> >
> > Still, I fail to see why generic/categorized tools increase/reduce the
> > risk of dead code and how they make packaging and deployment
> > more difficult/easier.
> >
> > Would you please explain this?
> >
> > Thanks.
> >
> > Alejandro
> >
> > On Tue, Sep 6, 2011 at 6:38 PM, Eric Yang <eric818@gmail.com> wrote:
> >
> >> Option #2, proposed by Amareshwari, seems like a better proposal.  We
> >> don't want to repeat the history of contrib with hadoop-tools.  Having a
> >> generic module like hadoop-tools increases the risk of accumulating dead
> >> code.  It would be better to categorize the hdfs- or mapreduce-specific
> >> tools in their respective subcategories.  It is also easier to manage
> >> from a package/deployment perspective.
> >>
> >> regards,
> >> Eric
> >>
> >> On Sep 6, 2011, at 4:32 PM, Eli Collins wrote:
> >>
> >> > On Tue, Sep 6, 2011 at 10:11 AM, Allen Wittenauer <aw@apache.org> wrote:
> >> >>
> >> >> On Sep 6, 2011, at 9:30 AM, Vinod Kumar Vavilapalli wrote:
> >> >>> We still need to answer Amareshwari's question (2) she asked some time
> >> >>> back about the automated code compilation and test execution of the
> >> >>> tools module.
> >> >>
> >> >>
> >> >>
> >> >>>>> My #1 question is if tools is basically contrib reborn.  If not,
> >> >>>>> what makes it different?
> >> >>
> >> >>
> >> >>        I'm still waiting for this answer as well.
> >> >>
> >> >>        Until such, I would be pretty much against a tools module.
> >> >>  Changing the name of the dumping ground doesn't make it any less of a
> >> >> dumping ground.
> >> >
> >> > IMO if the tools module only gets stuff like distcp that's maintained,
> >> > then it's not contrib; if it contains all the stuff from the current
> >> > MR contrib, then tools is just a re-labeling of contrib. Given that
> >> > this proposal only covers moving distcp to tools, it doesn't sound like
> >> > contrib to me.
> >> >
> >> > Thanks,
> >> > Eli
> >>
> >>
> >
>
