hadoop-mapreduce-dev mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-4813) AM timing out during job commit
Date Wed, 21 Nov 2012 15:32:05 GMT
Jason Lowe created MAPREDUCE-4813:

             Summary: AM timing out during job commit
                 Key: MAPREDUCE-4813
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4813
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: applicationmaster
    Affects Versions: 2.0.1-alpha, 0.23.3
            Reporter: Jason Lowe
            Priority: Critical

The AM calls the output committer's {{commitJob}} method synchronously during a JobImpl state
transition, so the JobImpl write lock is held for the entire time the job is being committed.
While the write lock is held, the RM allocator thread cannot heartbeat to the RM. Therefore,
if committing the job takes too long (e.g., the job has a very large number of files to commit
and/or the namenode is bogged down), the AM appears unresponsive to the RM and the RM kills
the AM attempt.
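
The blocking pattern described above can be reproduced in isolation. The following is a minimal sketch, not the actual AM code: the class {{CommitLockSketch}}, the method {{heartbeatsDuringCommit}}, and the 10 ms heartbeat interval are all hypothetical, and it assumes the allocator thread must take the job's read lock before each heartbeat. The slow commit is simulated with a sleep.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the AM; none of these names come from Hadoop.
class CommitLockSketch {

    /**
     * Simulates a commit that takes commitMillis and returns how many
     * heartbeats completed while it ran. If holdWriteLock is true, the
     * commit runs with the job write lock held, as in the reported bug.
     */
    static int heartbeatsDuringCommit(boolean holdWriteLock, long commitMillis)
            throws InterruptedException {
        ReentrantReadWriteLock jobLock = new ReentrantReadWriteLock();
        AtomicInteger heartbeats = new AtomicInteger();

        // Assumed allocator behaviour: read job state under the read lock,
        // then count one heartbeat, roughly every 10 ms.
        Thread allocator = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    jobLock.readLock().lock();
                    try {
                        heartbeats.incrementAndGet();
                    } finally {
                        jobLock.readLock().unlock();
                    }
                    Thread.sleep(10);
                }
            } catch (InterruptedException e) {
                // Interrupted by the caller: stop heartbeating.
            }
        });
        allocator.setDaemon(true);
        allocator.start();

        if (holdWriteLock) {
            jobLock.writeLock().lock();
        }
        int delta;
        try {
            int before = heartbeats.get();
            Thread.sleep(commitMillis); // stands in for a slow commitJob call
            delta = heartbeats.get() - before;
        } finally {
            if (holdWriteLock) {
                jobLock.writeLock().unlock();
            }
        }

        allocator.interrupt();
        allocator.join();
        return delta;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("write lock held during commit: "
                + heartbeatsDuringCommit(true, 300));
        System.out.println("commit outside the lock:       "
                + heartbeatsDuringCommit(false, 300));
    }
}
```

With the write lock held, no heartbeat can complete during the simulated commit, so the first call returns 0. When the commit runs outside the lock, heartbeats continue while it is in progress.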

