hadoop-mapreduce-dev mailing list archives

From Mithun Radhakrishnan <mithun.radhakrish...@yahoo.com>
Subject Re: DistCpV2 in 0.23
Date Fri, 26 Aug 2011 12:45:10 GMT
Would it be acceptable if retooling of tools/ were taken up separately? It sounds to me like
this might be a distinct (albeit related) task.

Mithun


________________________________
From: Giridharan Kesavan <gkesavan@hortonworks.com>
To: mapreduce-dev@hadoop.apache.org
Sent: Friday, August 26, 2011 12:04 PM
Subject: Re: DistCpV2 in 0.23

+1 to Alejandro's

I prefer to keep the hadoop-tools at trunk level.

-Giri

On Thu, Aug 25, 2011 at 9:15 PM, Alejandro Abdelnur <tucu@cloudera.com> wrote:
> I'd suggest putting hadoop-tools either at trunk/ level or having a tools
> aggregator module for hdfs and another for common.
>
> I personally would prefer it at trunk/.
>
> Thanks.
>
> Alejandro
>
> On Thu, Aug 25, 2011 at 9:06 PM, Amareshwari Sri Ramadasu <
> amarsri@yahoo-inc.com> wrote:
>
>> Agree. It should be a separate maven module (and the patch puts it as a
>> separate maven module now). A top level for hadoop tools is nice to have, but
>> it becomes hard to maintain until patch automation runs the tests under
>> tools. Currently we often see changes in HDFS affecting RAID tests
>> in MapReduce. So, I'm fine putting the tools under hadoop-mapreduce.
>>
>> I propose we can have something like the following:
>>
>> trunk/
>>  - hadoop-mapreduce
>>      - hadoop-mr-client
>>      - hadoop-yarn
>>      - hadoop-tools
>>          - hadoop-streaming
>>          - hadoop-archives
>>          - hadoop-distcp
>>
>> Thoughts?
>>
>> @Eli and @JD, we did not replace the old legacy distcp because this is really a
>> complete rewrite and we did not want to remove it until users are familiar
>> with the new one.
>>
>> On 8/26/11 12:51 AM, "Todd Lipcon" <todd@cloudera.com> wrote:
>>
>> Maybe a separate toplevel for hadoop-tools? Stuff like RAID could go
>> in there as well - ie tools that are downstream of MR and/or HDFS.
>>
>> On Thu, Aug 25, 2011 at 12:09 PM, Mahadev Konar <mahadev@hortonworks.com>
>> wrote:
>> > +1 for a separate module in hadoop-mapreduce-project. I think
>> > hadoop-mapreduce-client might not be the right place for it. We might have
>> > to pick a new maven module under hadoop-mapreduce-project that could
>> > host streaming/distcp/hadoop archives.
>> >
>> > thanks
>> > mahadev
>> >
>> > On Thu, Aug 25, 2011 at 11:04 AM, Alejandro Abdelnur <tucu@cloudera.com>
>> wrote:
>> >> Agree, it should be a separate maven module.
>> >>
>> >> And it should be under hadoop-mapreduce-client, right?
>> >>
>> >> And now that we are on the topic, the same should go for streaming, no?
>> >>
>> >> Thanks.
>> >>
>> >> Alejandro
>> >>
>> >> On Thu, Aug 25, 2011 at 10:58 AM, Todd Lipcon <todd@cloudera.com>
>> wrote:
>> >>
>> >>> On Thu, Aug 25, 2011 at 10:36 AM, Eli Collins <eli@cloudera.com>
>> wrote:
>> >>> > Nice work!   I definitely think this should go in 23 and 20x.
>> >>> >
>> >>> > Agree with JD that it should be in the core code, not contrib. If
>> >>> > it's going to be maintained then we should put it in the core code.
>> >>>
>> >>> Now that we're all mavenized, though, a separate maven module and
>> >>> artifact does make sense IMO - ie "hadoop jar
>> >>> hadoop-distcp-0.23.0-SNAPSHOT" rather than "hadoop distcp"
>> >>>
>> >>> -Todd
>> >>> --
>> >>> Todd Lipcon
>> >>> Software Engineer, Cloudera
>> >>>
>> >>
>> >
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>>
>



-- 
-Giri