hadoop-mapreduce-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1521) Protection against incorrectly configured reduces
Date Mon, 22 Feb 2010 17:44:30 GMT
Protection against incorrectly configured reduces
-------------------------------------------------

                 Key: MAPREDUCE-1521
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1521
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: jobtracker
            Reporter: Arun C Murthy
            Assignee: Arun C Murthy
             Fix For: 0.22.0


We've seen a fair number of instances where inexperienced users process huge data sets (>10 TB)
with a badly misconfigured number of reduces, e.g. a single reduce.

This is a significant problem on large clusters because each attempt of the reduce takes
a long time to shuffle and then runs into problems such as exhausting local disk space. The
job then goes through 4 such attempts before it finally fails.

Proposal: Come up with heuristics/configs to fail such jobs early. 
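
To make the proposal concrete, here is a minimal sketch of one possible heuristic: reject a job at submission time when the data each reduce would have to handle exceeds a configured limit. Everything in it is an assumption for illustration only. The class name ReduceCountCheck, the method shouldReject and the per-reduce byte limit are hypothetical and are not existing Hadoop APIs or config keys, and total job input size is used as a rough stand-in for shuffle volume, which really depends on map output size.

```java
/**
 * Hypothetical submission-time check (not part of Hadoop):
 * flags jobs whose reduces would each handle too much data.
 */
public final class ReduceCountCheck {

    /** Assumed limit: maximum input bytes a single reduce may be asked to handle. */
    private final long maxBytesPerReduce;

    public ReduceCountCheck(long maxBytesPerReduce) {
        if (maxBytesPerReduce <= 0) {
            throw new IllegalArgumentException("maxBytesPerReduce must be positive");
        }
        this.maxBytesPerReduce = maxBytesPerReduce;
    }

    /**
     * Returns true if the job should be failed early.
     *
     * @param inputBytes total job input size in bytes (a proxy for shuffle volume)
     * @param numReduces configured number of reduces
     */
    public boolean shouldReject(long inputBytes, int numReduces) {
        if (inputBytes < 0) {
            throw new IllegalArgumentException("inputBytes must not be negative");
        }
        if (numReduces <= 0) {
            // Map-only job: nothing is shuffled, so there is nothing to check.
            return false;
        }
        // Ceiling division, written so it cannot overflow.
        long bytesPerReduce = inputBytes / numReduces
                + (inputBytes % numReduces == 0 ? 0 : 1);
        return bytesPerReduce > maxBytesPerReduce;
    }
}
```

With an arbitrarily chosen limit of 100 GiB per reduce, a 10 TiB job with 1 reduce would be rejected up front, while the same job with 500 reduces (about 20.5 GiB each) would pass. The limit itself would have to be a per-cluster setting, since sensible values depend on local disk capacity.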

Thoughts?


