hadoop-mapreduce-dev mailing list archives

From "Amol Kekre (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-2718) Job fails if AppMaster is killed
Date Thu, 21 Jul 2011 00:05:58 GMT
Job fails if AppMaster is killed
--------------------------------

                 Key: MAPREDUCE-2718
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2718
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv2
            Reporter: Amol Kekre
             Fix For: 0.23.0


Started a cluster and submitted a sleep job with around 10000 maps and 1000 reduces.
When 5000 maps had completed, the AppMaster was killed.
The RM web UI shows the application as failed.
The job client, after retrying 50 times, fails with:
{
java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBClientImpl.java:161)
        at org.apache.hadoop.mapred.ClientServiceDelegate.getTaskCompletionEvents(ClientServiceDelegate.java:254)
        at org.apache.hadoop.mapred.YARNRunner.getTaskCompletionEvents(YARNRunner.java:520)
        at org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:540)
        at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1130)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1084)
        at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:259)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
        at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:191)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
        at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:111)
        at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:118)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:192)
Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call to /98.137.103.174:42557 failed on connection exception: java.net.ConnectException: Connection refused
        at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:96)
        at $Proxy11.getTaskAttemptCompletionEvents(Unknown Source)
        at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBClientImpl.java:154)
        ... 21 more
Caused by: java.net.ConnectException: Call to /... failed on connection exception: java.net.ConnectException: Connection refused
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1087)
        at org.apache.hadoop.ipc.Client.call(Client.java:1063)
        at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:250)
        at org.apache.hadoop.yarn.ipc.$Proxy10.call(Unknown Source)
        at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:94)
        ... 23 more
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:375)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:448)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:536)
        at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:211)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1196)
        at org.apache.hadoop.ipc.Client.call(Client.java:1040)
}
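
The trace above nests a java.net.ConnectException under two wrapper exceptions. As a rough illustration only (this is not the Hadoop client code, and the class and method names CauseChain, rootCause and isConnectionRefused are hypothetical), a caller could walk the cause chain with plain JDK calls to tell "AppMaster unreachable" apart from other failures. A generic Exception stands in for com.google.protobuf.ServiceException so the sketch needs no extra dependencies:

```java
import java.lang.reflect.UndeclaredThrowableException;
import java.net.ConnectException;

public class CauseChain {
    /** Follows getCause() links until the innermost exception is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        // Stop at the end of the chain; also guard against a self-referential cause.
        while (cur.getCause() != null && cur.getCause() != cur) {
            cur = cur.getCause();
        }
        return cur;
    }

    /** True when the innermost cause is a ConnectException, as in the trace above. */
    static boolean isConnectionRefused(Throwable t) {
        return rootCause(t) instanceof ConnectException;
    }

    public static void main(String[] args) {
        // Rebuild the shape of the reported chain. The middle Exception is a
        // stand-in (assumption) for com.google.protobuf.ServiceException.
        Throwable wrapped = new UndeclaredThrowableException(
                new Exception("stand-in for ServiceException",
                        new ConnectException("Connection refused")));

        System.out.println(rootCause(wrapped).getClass().getName()); // java.net.ConnectException
        System.out.println(isConnectionRefused(wrapped));            // true
    }
}
```

Whether the job client should retry against a restarted AppMaster or give up in this case is exactly what this issue leaves open; the sketch only shows how the condition could be detected.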

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
