hadoop-common-user mailing list archives

From jiang licht <licht_ji...@yahoo.com>
Subject Hadoop freeze?
Date Thu, 25 Feb 2010 20:17:43 GMT
I ran into the following problem while running a Hadoop job written in Pig. Please help me
figure out what caused it. As far as I can tell, the job/task tracker failed for some reason,
but the name/data nodes are still functioning.

The job seems to make no progress at all (no output, no log), although a couple of other Hadoop
jobs ran successfully before this one. "hadoop fs -ls" can still list files. But when I ran
"hadoop job -list", it took a long time and then failed with the following error message.

Exception in thread "main" java.io.IOException: Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
    at org.apache.hadoop.ipc.Client.call(Client.java:699)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
    at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:435)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:429)
    at org.apache.hadoop.mapred.JobClient.run(JobClient.java:1512)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1727)
Caused by: java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
    at sun.nio.ch.IOUtil.read(IOUtil.java:206)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:140)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
    at java.io.FilterInputStream.read(FilterInputStream.java:116)
    at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:271)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    at java.io.DataInputStream.readInt(DataInputStream.java:370)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:493)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:438)

The web interface of the job tracker on port 50030 did not respond at all.

Checking with netstat, port 50030 sometimes shows up and sometimes does not. Connections and
ports for the data nodes were listed there.
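
For anyone who wants to repeat the port check without reading netstat output, here is a minimal sketch in Python. It only tests whether a TCP connection to each port can be opened. The function names port_is_open and probe_ports are made up for illustration, "hostname" is the same placeholder used in the error messages above, and the 3-second timeout is an arbitrary assumption.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def probe_ports(host, ports, timeout=3.0):
    """Map each port to True (accepting connections) or False."""
    return {port: port_is_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # 50002 is the RPC port from the error message, 50030 the web interface.
    for port, is_open in probe_ports("hostname", [50002, 50030]).items():
        print(port, "open" if is_open else "closed or unreachable")
```

A port reported as open here only means something accepted the connection. It does not rule out the "Connection reset by peer" seen above, which happens after the connection has been established.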

Then, when I ran another Pig job, it failed with the following error:

Error before Pig is launched
----------------------------
ERROR 6009: Failed to create job client:Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer

org.apache.pig.backend.executionengine.ExecException: ERROR 6009: Failed to create job client:Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:217)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:137)
    at org.apache.pig.impl.PigContext.connect(PigContext.java:199)
    at org.apache.pig.PigServer.<init>(PigServer.java:169)
    at org.apache.pig.PigServer.<init>(PigServer.java:158)
    at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:54)
    at org.apache.pig.Main.main(Main.java:395)
Caused by: java.io.IOException: Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
    at org.apache.hadoop.ipc.Client.call(Client.java:699)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
    at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:435)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:429)
    at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:398)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:212)
    ... 6 more
Caused by: java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
    at sun.nio.ch.IOUtil.read(IOUtil.java:206)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:140)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
    at java.io.FilterInputStream.read(FilterInputStream.java:116)
    at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:271)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    at java.io.DataInputStream.readInt(DataInputStream.java:370)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:493)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:438)
================================================================================

Thanks,

Michael


      