hadoop-common-user mailing list archives

From Calvin <iphcal...@gmail.com>
Subject unexplained time between map 100% reduce 100% and job completion
Date Tue, 09 Sep 2014 22:33:39 GMT

I'm wondering what happens between the point where a MapReduce job
reports 100% for both its map and reduce tasks and the point where
the job's status changes to completed.

An example log:

14/09/09 08:14:26 INFO mapreduce.Job:  map 100% reduce 98%
14/09/09 08:14:47 INFO mapreduce.Job:  map 100% reduce 99%
14/09/09 08:15:25 INFO mapreduce.Job:  map 100% reduce 100%
14/09/09 08:23:22 INFO mapreduce.Job: Job job_1410269811222_0001 completed successfully

In this case, about 8 minutes pass between the "map 100% reduce 100%"
line and the message that the job completed successfully. My initial
guess is that it's doing some filesystem work, but the reducers aren't
writing much output (it's a simple wordcount variant):

File Input Format Counters
    Bytes Read=334624311758
File Output Format Counters
    Bytes Written=107785
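
For reference, the size of the gap can be read directly off the two log
timestamps. A minimal Python sketch, assuming the client log timestamps
use the yy/MM/dd HH:mm:ss layout (the helper name gap_seconds is made up
for illustration and is not part of Hadoop):

```python
from datetime import datetime

# Assumed layout of the timestamps in the client log: yy/MM/dd HH:mm:ss
LOG_TIME_FORMAT = "%y/%m/%d %H:%M:%S"


def gap_seconds(start: str, end: str) -> int:
    """Return the whole number of seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return int((t1 - t0).total_seconds())


# Timestamps copied from the example log
reduce_done = "14/09/09 08:15:25"  # "map 100% reduce 100%"
job_done = "14/09/09 08:23:22"     # "completed successfully"

gap = gap_seconds(reduce_done, job_done)
minutes, seconds = divmod(gap, 60)
print(f"{gap} s ({minutes} min {seconds} s)")  # prints "477 s (7 min 57 s)"
```

The same helper can be pointed at any other pair of lines from the log,
for example to compare this gap with the time spent between reduce 98%
and reduce 100%.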

Any ideas regarding this behavior or where I should look first?

