hadoop-mapreduce-user mailing list archives

From Ling Kun <lkun.e...@gmail.com>
Subject When is the reduce output file part-0000X moved to the final output directory, and by whom?
Date Fri, 10 May 2013 03:19:29 GMT
Dear all,

I am looking into the MR workflow and want to know more details about
how the reduce output data is copied.

Here is my question.

For the DFSIO test, or some other MR job, each reduce task runs on a
TaskTracker (TT) and writes its files to a directory named like this:
"XXX//_temporary/_attempt_201305101045_0005_r_000000_0/", along with a
result file named part-00000.
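
To make the layout described above concrete, here is a minimal Python
sketch of how the attempt directory and the part file name fit together.
It only builds strings and does not touch Hadoop. The helper names
(attempt_dir, part_file_name) and the output directory "/benchmarks/out"
are assumptions made for illustration; the real output directory is
elided as "XXX" above.

```python
import posixpath


def attempt_dir(output_dir, attempt_id):
    """Build the per-attempt temporary directory, following the naming
    pattern quoted in the message: <output>/_temporary/_<attempt id>."""
    return posixpath.join(output_dir, "_temporary", "_" + attempt_id)


def part_file_name(attempt_id):
    """Derive the part file name from the task number in the attempt ID.

    The ID has the shape
    attempt_<jobtracker start time>_<job number>_<m|r>_<task number>_<attempt number>.
    """
    fields = attempt_id.split("_")
    task_number = int(fields[4])
    return "part-%05d" % task_number


# Hypothetical output directory, standing in for "XXX" in the message.
output_dir = "/benchmarks/out"
attempt_id = "attempt_201305101045_0005_r_000000_0"

tmp_file = posixpath.join(attempt_dir(output_dir, attempt_id),
                          part_file_name(attempt_id))
print(tmp_file)
```

Run as-is, this prints the full path of part-00000 inside the attempt's
temporary directory for the example ID above.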

After the reducer finishes its task, the reducer output file part-00000
should be moved from the local disk to HDFS.

My question is: is part-00000 copied to HDFS at the moment the reducer
finishes its task? And who performs this copy: the reducer child process,
the TaskTracker that runs the reduce task, or the JobTracker?

Thanks,

yours,
Kun Ling

-- 
http://www.lingcc.com
