hadoop-common-dev mailing list archives

From "Christian Kunz (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3462) reduce task failures during shuffling should not count against number of retry attempts
Date Thu, 29 May 2008 06:50:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12600695#action_12600695 ]

Christian Kunz commented on HADOOP-3462:
----------------------------------------

No, not merge failures.
What I see is:

Task task_200804260028_0027_r_000093_12 failed to report status for 1214 seconds. Killing!

or

2008-05-26 04:11:22,354 INFO org.apache.hadoop.mapred.TaskRunner: Communication exception:
java.net.SocketTimeoutException: timed out waiting for rpc response
        at org.apache.hadoop.ipc.Client.call(Client.java:514)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:198)
        at org.apache.hadoop.mapred.$Proxy0.statusUpdate(Unknown Source)
        at org.apache.hadoop.mapred.Task$1.run(Task.java:294)
        at java.lang.Thread.run(Thread.java:619)

or


2008-05-28 09:26:51,105 ERROR org.apache.hadoop.mapred.ReduceTask: Map output copy failure:
java.lang.IllegalStateException: Shutdown in progress
        at java.lang.ApplicationShutdownHooks.add(ApplicationShutdownHooks.java:39)
        at java.lang.Runtime.addShutdownHook(Runtime.java:192)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1195)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:150)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:763)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:696)

There may be other kinds of errors as well.

What they have in common is that the failures occurred on nodes with disk errors.

> reduce task failures during shuffling should not count against number of retry attempts
> ---------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3462
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3462
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.3
>            Reporter: Christian Kunz
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

