Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: mapreduce-issues@hadoop.apache.org
Date: Tue, 19 Nov 2013 19:27:26 +0000 (UTC)
From: "Chris Nauroth (JIRA)" <jira@apache.org>
To: mapreduce-issues@hadoop.apache.org
Message-ID: <JIRA.12677498.1383619683553.100096.1384889246397@arcas>
In-Reply-To: <JIRA.12677498.1383619683553@arcas>
References: <JIRA.12677498.1383619683553@arcas>
Subject: [jira] [Reopened] (MAPREDUCE-5606) JobTracker blocked for
 DFSClient: Failed recovery attempt
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/MAPREDUCE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth reopened MAPREDUCE-5606:
--------------------------------------

      Assignee:     (was: firegun)

I'm reopening this.  There is an actual bug here (holding a global lock in the JT while doing I/O).  Despite the config workaround I described, I don't think we can really call it resolved.
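
To illustrate the class of bug described above, here is a minimal sketch. It is not the actual JobTracker code: LockScopeSketch, globalLock, slowIo, and status are all hypothetical names, and Thread.sleep stands in for a DFS call that hangs on an unresponsive DataNode. It only shows why a cheap status read (such as the web UI or a job listing) stalls when blocking I/O runs inside a shared lock, and how moving the I/O outside the lock avoids that.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the JobTracker; none of these names are real Hadoop APIs.
class LockScopeSketch {
    private final Object globalLock = new Object();
    private final CountDownLatch ioStarted = new CountDownLatch(1);
    private int updates = 0;

    // Simulates a DFS call that hangs on an unresponsive DataNode.
    private void slowIo(long millis) throws InterruptedException {
        ioStarted.countDown();
        Thread.sleep(millis);
    }

    // Problematic pattern: blocking I/O runs while the global lock is held.
    void ioInsideLock(long millis) throws InterruptedException {
        synchronized (globalLock) {
            slowIo(millis);
            updates++;
        }
    }

    // Alternative: finish the I/O first, then lock only for the in-memory update.
    void ioOutsideLock(long millis) throws InterruptedException {
        slowIo(millis);
        synchronized (globalLock) {
            updates++;
        }
    }

    // Cheap read that needs the same lock, like a web UI page or a job listing.
    int status() {
        synchronized (globalLock) {
            return updates;
        }
    }

    // Measures how long status() waits while a worker thread is in its I/O call.
    static long statusLatencyMillis(boolean insideLock, long ioMillis) {
        LockScopeSketch jt = new LockScopeSketch();
        Thread worker = new Thread(() -> {
            try {
                if (insideLock) jt.ioInsideLock(ioMillis); else jt.ioOutsideLock(ioMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        try {
            jt.ioStarted.await();
            long start = System.nanoTime();
            jt.status();
            long elapsed = (System.nanoTime() - start) / 1_000_000L;
            worker.join();
            return elapsed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("I/O inside lock:  status() waited ~" + statusLatencyMillis(true, 300) + " ms");
        System.out.println("I/O outside lock: status() waited ~" + statusLatencyMillis(false, 300) + " ms");
    }
}
```

In this sketch the first call should wait for roughly the whole simulated I/O delay and the second should return almost immediately. Whether the real fix can restructure the locking this simply is a separate question for the patch.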

What I'm not sure about is if this is a duplicate of MAPREDUCE-1144.  If anyone on that issue can tell, then we can close this as duplicate.

> JobTracker blocked for DFSClient: Failed recovery attempt
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-5606
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5606
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 1.0.3
>         Environment: CentOS 5.8, JDK 1.7
>            Reporter: firegun
>            Priority: Critical
>
> When a DataNode crashed, the server still responded to ping, but RPC calls failed and SSH login was not possible. The JobTracker may then request a block on this DataNode.
> When that happens, the JobTracker stops working: the web UI is unresponsive, hadoop job -list does not work either, and the JobTracker logs show no other information.
> We then need to restart the DataNode.
> After that the JobTracker works again, but the TaskTracker count drops to zero,
> so we need to run: hadoop mradmin -refreshNodes
> Then the JobTracker begins to add TaskTrackers again, but very slowly.
> This problem occurred 5 times in 2 weeks.


--
This message was sent by Atlassian JIRA
(v6.1#6144)