hadoop-mapreduce-dev mailing list archives

From "Dick King (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1967) When a reducer fails on DFS quota, the job should fail immediately
Date Sat, 24 Jul 2010 00:51:50 GMT
When a reducer fails on DFS quota, the job should fail immediately
------------------------------------------------------------------

                 Key: MAPREDUCE-1967
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1967
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
            Reporter: Dick King


Suppose an M/R job has so much output that the user is certain to exceed their DFS quota.  Then
some of the reducers will succeed, but the job will get into a state where the remaining reducers
squabble over the remaining space.  Those reducers will nibble at the remaining space until
one of them fails on quota.  Its output file will be erased, and the other reducers will
collectively consume the freed space until one of _them_ fails on quota.  Since the incomplete
reducer that fails on quota is "chosen" randomly, the tasks will accumulate their failures at
similar rates, and by the time the job finally fails the system will have made a substantial
futile investment.
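
The dynamics described above can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions and is not Hadoop code: every reducer writes one unit of output per step, a reducer that finds the quota full loses its output and restarts from zero, and the job ends when some task has failed `max_attempts` times (or on the first quota failure when `fail_fast` is set). The function name, parameters, and the unit-based model are all hypothetical.

```python
import random


def simulate(num_reducers, need, quota, max_attempts=4, fail_fast=False, seed=0):
    """Toy model: return (status, total units written) for one job run."""
    rng = random.Random(seed)
    written = [0] * num_reducers    # output currently held by each reducer
    failures = [0] * num_reducers   # quota failures seen by each task
    done = [False] * num_reducers
    total_written = 0               # all work performed, including erased output

    while not all(done):
        # Visit unfinished reducers in random order, so the one that
        # hits the quota is effectively chosen at random.
        order = [i for i in range(num_reducers) if not done[i]]
        rng.shuffle(order)
        for i in order:
            if sum(written) >= quota:
                # Quota exhausted: this attempt fails and its output is erased,
                # which frees space for the other reducers.
                failures[i] += 1
                written[i] = 0
                if fail_fast or failures[i] >= max_attempts:
                    return "failed", total_written
                continue
            written[i] += 1
            total_written += 1
            if written[i] == need:
                done[i] = True
    return "succeeded", total_written


if __name__ == "__main__":
    # 10 reducers needing 100 units each against a quota of 600 units.
    print(simulate(10, 100, 600))                  # retry until a task fails 4 times
    print(simulate(10, 100, 600, fail_fast=True))  # fail on the first quota failure
```

In this model both runs end in failure, but the run that waits for four failures of a single task writes more output in total before giving up, all of which is discarded.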

I propose that if a single reducer fails on DFS quota, the job should be failed immediately.
There may be a corner case that induces us to think that we shouldn't be quite this stringent,
but at the very least we shouldn't have to wait for one task to fail four times before shutting
the job down.
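
As a rough illustration of the proposed rule, the decision could look something like the following. This is a hedged sketch, not the JobTracker's actual logic: `FailureCause`, `should_fail_job`, and the `fail_fast_on_quota` switch are hypothetical names introduced here, and the default of four attempts simply mirrors the "four failures by one task" mentioned above.

```python
from enum import Enum


class FailureCause(Enum):
    """Hypothetical classification of why a task attempt failed."""
    DFS_QUOTA_EXCEEDED = "dfs_quota_exceeded"
    OTHER = "other"


def should_fail_job(cause, attempts_failed, max_attempts=4, fail_fast_on_quota=True):
    """Decide whether a failed task attempt should fail the whole job.

    cause           -- FailureCause of the attempt that just failed
    attempts_failed -- failed attempts of this task so far, including this one
    """
    # Proposed behaviour: a quota failure will not be cured by retrying,
    # so give up on the job right away.
    if fail_fast_on_quota and cause is FailureCause.DFS_QUOTA_EXCEEDED:
        return True
    # Existing behaviour: fail the job once one task has used up its attempts.
    return attempts_failed >= max_attempts
```

Keeping the switch as a parameter leaves room for the corner case mentioned above, where failing on the first quota error might turn out to be too strict.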

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

