hadoop-hdfs-user mailing list archives

From Ling Kun <lkun.e...@gmail.com>
Subject When and who move the reduce output file part-0000X to the final output directory
Date Fri, 10 May 2013 03:19:29 GMT
Dear all,

     I am looking into the MR workflow and want to know more about how
the reduce output data is copied.

    Here is my question.

   For the DFSIO test and other MR jobs, each reduce task runs on a
TT and writes its files to a directory named like this:
"XXX//_temporary/_attempt_201305101045_0005_r_000000_0/". That directory
also contains a result file named part-00000.
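
To make the layout I am describing concrete, here is a minimal sketch in
plain Python (not Hadoop code). The helper names attempt_dir, part_name and
parse_attempt_id are hypothetical, and the meaning I give to each field of
the attempt ID is my own reading of the name, so please correct me if it is
wrong.

```python
import posixpath


def attempt_dir(output_dir, attempt_id):
    """Build the temporary directory for one task attempt, as observed above."""
    return posixpath.join(output_dir, "_temporary", "_" + attempt_id)


def part_name(partition):
    """Result file name for a partition, zero-padded to five digits."""
    return "part-%05d" % partition


def parse_attempt_id(attempt_id):
    """Split an attempt ID into its fields (field meanings are my assumption)."""
    _, jt_id, job_no, kind, task_no, attempt_no = attempt_id.split("_")
    return {
        "jobtracker_id": jt_id,      # looks like the JobTracker start time
        "job": int(job_no),          # job sequence number
        "type": kind,                # "r" for reduce, "m" for map
        "task": int(task_no),        # task (partition) number
        "attempt": int(attempt_no),  # retry counter for this task
    }


attempt = "attempt_201305101045_0005_r_000000_0"
fields = parse_attempt_id(attempt)
tmp_file = posixpath.join(attempt_dir("XXX", attempt), part_name(fields["task"]))
final_file = posixpath.join("XXX", part_name(fields["task"]))
print(tmp_file)
print(final_file)
```

The first path printed is where I see the file while the task runs, and
the second is where it ends up in the final output directory.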

  After the reducer finishes its task, the output file part-00000 should,
as I understand it, be moved from the local disk to HDFS.

My question is: is part-00000 copied to HDFS at the moment the reducer
finishes its task? And who performs the copy: the reducer child process,
the TaskTracker that runs the reduce task, or the JobTracker?


Kun Ling

