hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-610) Task Tracker offerService does not adequately protect from exceptions
Date Wed, 18 Oct 2006 04:15:34 GMT
Task Tracker offerService does not adequately protect from exceptions
---------------------------------------------------------------------

                 Key: HADOOP-610
                 URL: http://issues.apache.org/jira/browse/HADOOP-610
             Project: Hadoop
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.7.1
            Reporter: Owen O'Malley
         Assigned To: Owen O'Malley
             Fix For: 0.8.0


The TaskTracker's offerService loop does not handle exceptions such as timeouts well: an
exception resets the task tracker. I believe this is the cause of most of the lost task trackers.
The scenario looks like this:

  1. An RPC timeout occurs in offerService.
  2. The task tracker cleans up (which takes 30 minutes, with the task tracker locked up).
  3. The task tracker is declared lost for not providing its heartbeat.
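
One possible way to guard against the scenario above is to catch the exception inside
the loop, so that a single failed RPC is logged and retried on the next iteration
instead of escaping and triggering a full reset. The sketch below is only an
illustration under that assumption; it is not the actual TaskTracker code or the fix
applied for this issue. The names OfferServiceSketch and Heartbeat, and the fixed
number of rounds, are hypothetical.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

public class OfferServiceSketch {

    /** Hypothetical stand-in for one heartbeat RPC to the job tracker. */
    interface Heartbeat {
        void send() throws IOException;
    }

    /**
     * Runs a fixed number of heartbeat rounds (a real service loop would run
     * until shutdown). An IOException, including an RPC timeout, is caught
     * per iteration, so the loop keeps going instead of unwinding.
     * Returns the number of failed heartbeats.
     */
    static int offerService(Heartbeat heartbeat, int rounds) {
        int failures = 0;
        for (int i = 0; i < rounds; i++) {
            try {
                heartbeat.send();
            } catch (IOException e) {
                // Assumption: a transient failure is safe to retry next round.
                failures++;
                System.err.println("heartbeat failed, will retry: " + e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulate a timeout on the second heartbeat only.
        Heartbeat flaky = () -> {
            if (++calls[0] == 2) {
                throw new SocketTimeoutException("simulated RPC timeout");
            }
        };
        int failures = offerService(flaky, 5);
        System.out.println("calls=" + calls[0] + " failures=" + failures);
    }
}
```

In this sketch the simulated timeout costs one heartbeat rather than the whole loop:
all five rounds run and exactly one failure is counted.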

