[ https://issues.apache.org/jira/browse/MAPREDUCE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Nauroth reopened MAPREDUCE-5606:
--------------------------------------
Assignee: (was: firegun)
I'm reopening this. There is an actual bug here (holding a global lock in the JT while doing
I/O). Despite the config workaround I described, I don't think we can really call it resolved.
What I'm not sure about is if this is a duplicate of MAPREDUCE-1144. If anyone on that issue
can tell, then we can close this as duplicate.
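To make the bug class concrete, here is a minimal sketch of "holding a global lock while doing I/O". It is not the actual JobTracker code: the class LockDuringIoSketch, its methods, and the timings are all hypothetical, and Thread.sleep stands in for a DFS call that stalls on an unresponsive DataNode. It only illustrates why a status request can stop answering while the lock holder waits on I/O, and how moving the I/O outside the critical section avoids that.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a service with one global lock; NOT JobTracker code.
class LockDuringIoSketch {
    private final ReentrantLock globalLock = new ReentrantLock();
    private int jobCount = 0; // in-memory state guarded by globalLock

    // Simulates a DFS call that stalls on an unresponsive DataNode.
    private static void slowIo(CountDownLatch ioStarted, long millis) {
        ioStarted.countDown(); // signal that the "I/O" is now in flight
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Problematic pattern: the I/O runs inside the critical section.
    void submitHoldingLock(CountDownLatch ioStarted, long ioMillis) {
        globalLock.lock();
        try {
            slowIo(ioStarted, ioMillis);
            jobCount++;
        } finally {
            globalLock.unlock();
        }
    }

    // Alternative: do the I/O first, take the lock only for the state update.
    void submitIoOutsideLock(CountDownLatch ioStarted, long ioMillis) {
        slowIo(ioStarted, ioMillis);
        globalLock.lock();
        try {
            jobCount++;
        } finally {
            globalLock.unlock();
        }
    }

    // Models a status request that needs the lock briefly.
    // Returns false if the lock is not available within the timeout.
    boolean listJobs(long timeoutMillis) {
        try {
            if (!globalLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        try {
            return true;
        } finally {
            globalLock.unlock();
        }
    }

    // Starts a submit in the background, waits until its I/O is in flight,
    // then reports whether a status request got the lock in time.
    static boolean respondsDuringSubmit(boolean holdLockDuringIo) {
        LockDuringIoSketch jt = new LockDuringIoSketch();
        CountDownLatch ioStarted = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            if (holdLockDuringIo) {
                jt.submitHoldingLock(ioStarted, 1000);
            } else {
                jt.submitIoOutsideLock(ioStarted, 1000);
            }
        });
        worker.start();
        try {
            ioStarted.await();
            boolean responded = jt.listJobs(100);
            worker.join();
            return responded;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("lock held during I/O, status answered: "
                + respondsDuringSubmit(true));
        System.out.println("I/O outside lock, status answered: "
                + respondsDuringSubmit(false));
    }
}
```

In this sketch the status request times out only in the first variant, because the worker sleeps inside the critical section. Whether the real fix can be as simple as moving the I/O depends on what state the JT code needs to keep consistent across the call, which this toy example does not model.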
> JobTracker blocked for DFSClient: Failed recovery attempt
> ---------------------------------------------------------
>
> Key: MAPREDUCE-5606
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5606
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobtracker
> Affects Versions: 1.0.3
> Environment: centos 5.8 jdk 1.7
> Reporter: firegun
> Priority: Critical
>
> When a DataNode crashes in such a way that the server still answers ping but accepts
> neither RPC calls nor SSH logins, the JobTracker may then request a block from that DataNode.
> When that happens, the JobTracker stops working: the web UI does not respond,
> hadoop job -list does not work either, and the JobTracker log shows no other information.
> We then have to restart the DataNode.
> After that the JobTracker works again, but the TaskTracker count drops to zero,
> and we need to run: hadoop mradmin -refreshNodes
> The JobTracker then starts adding TaskTrackers again, but very slowly.
> This problem occurred 5 times in 2 weeks.
--
This message was sent by Atlassian JIRA
(v6.1#6144)