hadoop-common-user mailing list archives

From "bit1129@163.com" <bit1...@163.com>
Subject Re: Re: Where the output of mappers are saved ?
Date Tue, 16 Dec 2014 08:12:17 GMT
Thanks, Susheel! Understood.



bit1129@163.com
 
From: Susheel Kumar Gadalay
Date: 2014-12-16 15:27
To: user
Subject: Re: Re: Where the output of mappers are saved ?
I don't think so. It will be a single output file per reducer.
 
If you want multiple smaller output files, specify a larger number of
reducers in the job configuration.
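
To illustrate the one-file-per-reducer point, here is a minimal, Hadoop-free
sketch in plain Java. It assumes the default hash partitioning formula (key
hash modulo the number of reducers) and the usual part-r-NNNNN file naming.
The class and method names (ReducerOutputSketch, partitionFor,
outputFileName, recordsPerFile) are hypothetical and are not part of the
Hadoop API; in a real job the reducer count would come from the job
configuration.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class ReducerOutputSketch {

    // Assumed to mirror the default hash partitioning formula:
    // a non-negative hash of the key, modulo the number of reducers.
    public static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Each reducer writes exactly one file, named after its partition number.
    // The five-digit suffix is an assumption about the naming scheme.
    public static String outputFileName(int partition) {
        return String.format(Locale.ROOT, "part-r-%05d", partition);
    }

    // Counts how many records would end up in each reducer output file.
    public static Map<String, Integer> recordsPerFile(String[] keys, int numReducers) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            String file = outputFileName(partitionFor(key, numReducers));
            counts.merge(file, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] keys = {"apple", "banana", "cherry", "date", "elder", "fig"};
        // One reducer: every key lands in the same single file.
        System.out.println(recordsPerFile(keys, 1));
        // Three reducers: the same keys are spread over at most three files.
        System.out.println(recordsPerFile(keys, 3));
    }
}
```

However large a partition grows, it still maps to a single file name, so the
only way to get smaller files in this model is to raise the reducer count.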
 
On 12/16/14, bit1129@163.com <bit1129@163.com> wrote:
> Thanks Susheel!!
> One more question: if part-r-XXXX is extremely large, say 2 GB, will the
> file be split into several files under the output directory? That is, could
> one reducer produce more than one file?
>
>
>
> bit1129@163.com
>
> From: Susheel Kumar Gadalay
> Date: 2014-12-16 14:17
> To: user
> Subject: Re: Re: Where the output of mappers are saved ?
> Yes, the map outputs will be cleaned up on job completion.
>
> If you want to see the map outputs, set the number of reducers to zero
> and check the files part-m-0000, part-m-0001, ...
>
> On 12/16/14, bit1129@163.com <bit1129@163.com> wrote:
>> Do they exist only while the MapReduce job is running, and are they
>> removed after the job finishes?
>>
>> When the reduce phase finished, I saw only part-m-0000, part-m-0001, ...,
>> which are the reduce results.
>>
>>
>>
>> bit1129@163.com
>>
>> From: Susheel Kumar Gadalay
>> Date: 2014-12-16 13:05
>> To: user
>> Subject: Re: Where the output of mappers are saved ?
>> Map outputs will be in HDFS, under your user name and output directory.
>>
>> They will have names like part-m-0000, part-m-0001, ...
>>
>>
>> On 12/16/14, Abdul Navaz <navaz.enc@gmail.com> wrote:
>>> Hello,
>>>
>>>
>>> Second try!
>>>
>>>
>>> I have created a directory to store the mapper output, as below:
>>>  <property>
>>>  <name>mapred.local.dir</name>
>>>  <value>/app/hadoop/tmp/myoutput</value>
>>>  </property>
>>> and I looked at its contents:
>>>  hduser@dn4:/app/hadoop/tmp/myoutput$ ls -lrt
>>>  total 16
>>>  drwxr-xr-x 2 hduser hadoop 4096 Dec 12 10:50 tt_log_tmp
>>>  drwx------ 3 hduser hadoop 4096 Dec 12 10:53 ttprivate
>>>  drwxr-xr-x 3 hduser hadoop 4096 Dec 12 10:53 taskTracker
>>>  drwxr-xr-x 4 hduser hadoop 4096 Dec 12 13:25 userlogs
>>> and I could not find anything here when I ran the MapReduce job. Where is
>>> the mapper output saved by default, and how can I get the size of the
>>> mapper output in bytes?
>>>
>>>
>>> Thanks.
>>>
>>>
>>> From:  Abdul Navaz <navaz.enc@gmail.com>
>>> Date:  Friday, December 12, 2014 at 12:36 AM
>>> To:  "user@hadoop.apache.org" <user@hadoop.apache.org>
>>> Subject:  Where the output of mappers are saved ?
>>>
>>> Hello,
>>>
>>>
>>> I am interested in efficiently managing Hadoop shuffle traffic and
>>> utilizing the network bandwidth effectively. To do this, I want to know
>>> how much shuffle traffic is generated by each DataNode. Shuffle traffic
>>> is nothing but the output of the mappers. So where is this mapper output
>>> saved? How can I get the size of the mapper output from each DataNode in
>>> real time?
>>> I appreciate your help.
>>>
>>> Thanks & Regards,
>>>
>>> Abdul Navaz
>>>
>>>
>>>
>>>
>>
>