hadoop-common-issues mailing list archives

From "Ivan Mitic (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-9970) TaskTracker hung after failed reconnect to the JobTracker
Date Tue, 17 Sep 2013 01:53:52 GMT
Ivan Mitic created HADOOP-9970:

             Summary: TaskTracker hung after failed reconnect to the JobTracker
                 Key: HADOOP-9970
                 URL: https://issues.apache.org/jira/browse/HADOOP-9970
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 1.3.0
            Reporter: Ivan Mitic
            Assignee: Ivan Mitic

The TaskTracker hangs after a failed reconnect to the JobTracker.

This is the problematic piece of code:
    this.distributedCacheManager = new TrackerDistributedCacheManager(
        this.fConf, taskController);
    this.jobClient = (InterTrackerProtocol) 
        new PrivilegedExceptionAction<Object>() {
      public Object run() throws IOException {
        return RPC.waitForProxy(InterTrackerProtocol.class,
            jobTrackAddr, fConf);

If RPC.waitForProxy() throws, the TrackerDistributedCacheManager cleanup thread is never
stopped. Because it is a non-daemon thread, it keeps the TaskTracker (TT) process alive forever.
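
The failure mode described above can be reproduced outside Hadoop with a minimal sketch. It is not the TaskTracker code and not necessarily the fix applied for this issue: ReconnectSketch, CleanupWorker, initialize() and the connect callback are hypothetical names, with connect standing in for the RPC.waitForProxy() call. The sketch shows one possible remedy, which is to stop the already-started non-daemon thread when the connect step throws.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical stand-in for the TaskTracker initialization path; not Hadoop code.
class ReconnectSketch {

    // Hypothetical stand-in for a component that owns a non-daemon cleanup thread.
    static final class CleanupWorker {
        private final Thread thread = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.sleep(100); // pretend to do periodic cleanup work
                }
            } catch (InterruptedException e) {
                // Interrupted while sleeping: fall through and let the thread exit.
            }
        }, "cleanup-worker");

        void start() {
            thread.setDaemon(false); // non-daemon: keeps the JVM alive while it runs
            thread.start();
        }

        void stop() throws InterruptedException {
            thread.interrupt();
            thread.join(); // wait until the thread has actually exited
        }

        boolean isAlive() {
            return thread.isAlive();
        }
    }

    CleanupWorker worker;

    // Starts the worker, then runs the connect step. If the connect step throws,
    // the worker is stopped before the exception propagates to the caller.
    void initialize(Callable<Object> connect) throws Exception {
        worker = new CleanupWorker();
        worker.start();
        boolean connected = false;
        try {
            connect.call(); // assumed to play the role of RPC.waitForProxy()
            connected = true;
        } finally {
            if (!connected) {
                worker.stop(); // do not leak the thread when connecting fails
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ReconnectSketch tt = new ReconnectSketch();
        try {
            tt.initialize(() -> { throw new IOException("JobTracker unreachable"); });
        } catch (IOException e) {
            System.out.println("connect failed: " + e.getMessage());
        }
        System.out.println("cleanup thread alive: " + tt.worker.isAlive());
    }
}
```

Without the finally block, the worker thread would still be running after the exception and the JVM would not exit when main returns, which matches the hang reported here. Marking the thread as a daemon would also let the process exit, at the cost of the thread being killed without a chance to finish its current cleanup pass.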

