hadoop-mapreduce-issues mailing list archives

From "Mayank Bansal (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4164) Hadoop 22 Exception thrown after task completion causes its reexecution
Date Wed, 18 Apr 2012 00:09:13 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256089#comment-13256089 ]

Mayank Bansal commented on MAPREDUCE-4164:
------------------------------------------

1. The TaskReporter thread periodically sends status updates or pings to the TaskTracker.
If it needs to report task progress, it sends a STATUS_UPDATE message to the TaskTracker.
Otherwise, it sends a PING signal to check whether the TaskTracker is alive.

2. When the map/reduce phase is over, the task calls stopCommunicationThread(), which
interrupts the ping/status-update thread.

3. If the thread is communicating with the server when the interrupt arrives, the interrupt
breaks the connection to the server. Because the interrupt was issued, the stream throws
ClosedByInterruptException.

4. However, in Client.java, the Client keeps waiting for the response, eventually times out,
and re-throws this exception.
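
The interrupt behaviour described in steps 2 and 3 can be reproduced outside Hadoop. The following is only a minimal sketch, not code from Task.java or Client.java: it assumes a java.nio Pipe standing in for the RPC socket, and the names InterruptedReporterSketch and interruptDuringIo are hypothetical. It shows that interrupting a thread during I/O on an interruptible channel closes the channel and raises ClosedByInterruptException in that thread.

```java
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-alone sketch; not part of Hadoop.
public class InterruptedReporterSketch {

    /**
     * Starts a reporter-like thread that blocks on channel I/O, interrupts it,
     * and returns the simple name of the exception the thread observed.
     */
    static String interruptDuringIo() throws Exception {
        // Assumption: a Pipe stands in for the socket to the TaskTracker.
        Pipe pipe = Pipe.open();
        AtomicReference<String> outcome = new AtomicReference<>("none");

        Thread reporter = new Thread(() -> {
            try {
                // Blocks, because nothing is ever written to the pipe's sink.
                pipe.source().read(ByteBuffer.allocate(16));
            } catch (ClosedByInterruptException e) {
                // The interrupt closed the channel underneath the I/O call.
                outcome.set(e.getClass().getSimpleName());
            } catch (Exception e) {
                outcome.set("unexpected: " + e);
            }
        }, "reporter-sketch");

        reporter.start();
        // Plays the role of stopCommunicationThread() interrupting the reporter.
        reporter.interrupt();
        reporter.join();
        pipe.sink().close();
        return outcome.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(interruptDuringIo());
    }
}
```

The result does not depend on whether the interrupt lands before or during the read: an interruptible channel also throws ClosedByInterruptException when a thread enters a blocking I/O call with its interrupt status already set.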

                
> Hadoop 22 Exception thrown after task completion causes its reexecution
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4164
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4164
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>         Attachments: MAPREDUCE-4164.patch
>
>
> 2012-02-28 19:17:08,504 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 3 segments left of total size: 1969310 bytes
> 2012-02-28 19:17:08,694 INFO org.apache.hadoop.mapred.Task: Task:attempt_201202272306_0794_m_000094_0 is done. And is in the process of commiting
> 2012-02-28 19:18:08,774 INFO org.apache.hadoop.mapred.Task: Communication exception: java.io.IOException: Call to /127.0.0.1:35400 failed on local exception: java.nio.channels.ClosedByInterruptException
> at org.apache.hadoop.ipc.Client.wrapException(Client.java:1094)
> at org.apache.hadoop.ipc.Client.call(Client.java:1062)
> at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
> at $Proxy0.statusUpdate(Unknown Source)
> at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:650)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.nio.channels.ClosedByInterruptException
> at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
> at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:60)
> at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
> at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:151)
> at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:112)
> at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> at java.io.DataOutputStream.flush(DataOutputStream.java:106)
> at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:769)
> at org.apache.hadoop.ipc.Client.call(Client.java:1040)
> ... 4 more
> 2012-02-28 19:18:08,825 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201202272306_0794_m_000094_0' done.
> ================>>>>>> SHOULD be <++++++++++++++
> 2012-02-28 19:17:02,214 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 3 segments left of total size: 1974104 bytes
> 2012-02-28 19:17:02,408 INFO org.apache.hadoop.mapred.Task: Task:attempt_201202272306_0794_m_000000_0 is done. And is in the process of commiting
> 2012-02-28 19:17:02,519 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201202272306_0794_m_000000_0' done.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
