hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-707) Final map task gets stuck
Date Fri, 17 Nov 2006 18:59:42 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-707?page=comments#action_12450855 ] 
            
Owen O'Malley commented on HADOOP-707:
--------------------------------------

Where is the task being lost? Use the web UI to see whether the JobTracker thinks the map is
running somewhere. If so, follow the link to the TaskTracker to see whether the assigned
TaskTracker thinks it is still running. If so, check whether the process is still running.

Otherwise, look at the logs to determine where the task ran, then look at the logs
on that task tracker's node to see what the task tracker thinks happened to it.
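
As a rough aid for the log check described above, the sketch below pulls every line that mentions one task ID out of a log file. It is only an illustration: the helper names (find_task_lines, scan_log) and the log path are hypothetical, and it assumes nothing about the log format beyond the task ID appearing as plain text on the relevant lines.

```python
# Hypothetical helpers, not part of Hadoop: a plain substring search
# for a task ID in a log file.

def find_task_lines(lines, task_id):
    """Return (line_number, text) pairs for lines that mention task_id."""
    matches = []
    for number, line in enumerate(lines, start=1):
        if task_id in line:
            matches.append((number, line.rstrip("\n")))
    return matches


def scan_log(path, task_id):
    """Open the log at path (a placeholder) and search it for task_id."""
    with open(path, errors="replace") as handle:
        return find_task_lines(handle, task_id)
```

For example, scan_log("/path/to/tasktracker.log", "task_0028_m_000534_0") would list what that log recorded for the stuck task, with the path replaced by wherever the logs live on the node.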

> Final map task gets stuck
> -------------------------
>
>                 Key: HADOOP-707
>                 URL: http://issues.apache.org/jira/browse/HADOOP-707
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.8.0
>         Environment: using latest trunk
>            Reporter: Johan Oskarson
>            Priority: Critical
>
> I've seen numerous jobs lately where the final map task gets stuck, never finishing.
> The jobtracker doesn't reassign the task. A restart of the tasktracker solves the issue
> and the job can finish.
> In the web interface it turns up as:
> task_0028_m_000534_0 node17.herd1 RUNNING 0.00%    10-Nov-2006 12:21:12 10-Nov-2006 12:22:19 (1mins, 6sec)
> Task failed to report status for 604 seconds. Killing.
> Only exception I find in that tasktracker log is this (a few times):
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:125)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:294)
>         at org.apache.hadoop.ipc.SocketChannelOutputStream.flushBuffer(SocketChannelOutputStream.java:108)
>         at org.apache.hadoop.ipc.SocketChannelOutputStream.write(SocketChannelOutputStream.java:89)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at java.io.DataOutputStream.flush(DataOutputStream.java:106)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:532)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
