I noticed that you are using JDK 1.7; personally, I prefer 1.6.x.
If your firewall is OK, check whether your RPC service is also OK; you can test it with telnet 10.130.110.80 50020.
I suggested Hive because HQL (SQL-like) is familiar to most people, and the learning curve is gentle.
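
If telnet is not available on the node, the same check can be sketched in a few lines of Java. This is only a minimal sketch: the class name PortCheck, the method canConnect and the 3000 ms timeout are made up for illustration, and the default host and port are the ones from your log. Like telnet, it only shows that the port accepts TCP connections, not that the RPC server answers in time.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    /**
     * Returns true if a TCP connection to host:port is established within
     * timeoutMillis, false on refusal, timeout or any other I/O error.
     * (Hypothetical helper, not part of Hadoop.)
     */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers ConnectException and SocketTimeoutException as well.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are the datanode address and IPC port taken from the log.
        String host = args.length > 0 ? args[0] : "10.130.110.80";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50020;
        System.out.println(host + ":" + port + " reachable: "
                + canConnect(host, port, 3000));
    }
}
```

You could run it on the worker as: java PortCheck 10.130.110.80 50020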


2012/12/4 Haitao Yao <yao.erix@gmail.com>
The firewall is OK.
Well, personally I prefer Pig. And it's a big project; switching from Pig to Hive would not be easy.
Thanks.

Haitao Yao
weibo: @haitao_yao
Skype:  haitao.yao.final

On 2012-12-4, at 3:14 PM, panfei <cnweike@gmail.com> wrote:

Please check your firewall settings. Also, why not use Hive for this work?


2012/12/4 Haitao Yao <yao.erix@gmail.com>
Hi, all
I'm using Hadoop 1.2.0, java version "1.7.0_05".
When I run my Pig script, the worker always reports this error, and the MR jobs run very slowly.
Increasing the dfs.socket.timeout value does not help. The network is OK; telnet to port 50020 always succeeds.
Here's the stack trace:
2012-12-04 14:29:41,323 INFO org.apache.hadoop.hdfs.DFSClient: Failed to read blk_-2337696885631113108_11054058 on local machinejava.net.SocketTimeoutException: Call to /10.130.110.80:50020 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.130.110.80:57689 remote=/10.130.110.80:50020]
	at org.apache.hadoop.ipc.Client.wrapException(Client.java:1140)
	at org.apache.hadoop.ipc.Client.call(Client.java:1112)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
	at $Proxy3.getProtocolVersion(Unknown Source)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:392)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:374)
	at org.apache.hadoop.hdfs.DFSClient.createClientDatanodeProtocolProxy(DFSClient.java:212)
	at org.apache.hadoop.hdfs.BlockReaderLocal$LocalDatanodeInfo.getDatanodeProxy(BlockReaderLocal.java:90)
	at org.apache.hadoop.hdfs.BlockReaderLocal$LocalDatanodeInfo.access$200(BlockReaderLocal.java:65)
	at org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:224)
	at org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:145)
	at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:509)
	at org.apache.hadoop.hdfs.DFSClient.access$800(DFSClient.java:78)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2231)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2384)
	at java.io.DataInputStream.read(DataInputStream.java:149)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
	at org.apache.pig.impl.io.BufferedPositionedInputStream.read(BufferedPositionedInputStream.java:52)
	at org.apache.pig.impl.io.InterRecordReader.nextKeyValue(InterRecordReader.java:86)
	at org.apache.pig.impl.io.InterStorage.getNext(InterStorage.java:77)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.130.110.80:57689 remote=/10.130.110.80:50020]
	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:361)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
	at java.io.DataInputStream.readInt(DataInputStream.java:387)
	at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:841)
	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:786)

I checked the source code; the exception is thrown here:
      //now wait for socket to be ready.
      int count = 0;
      try {
        count = selector.select(channel, ops, timeout);
      } catch (IOException e) { //unexpected IOException.
        closed = true;
        throw e;
      }

      if (count == 0) {
        //here!!
        throw new SocketTimeoutException(timeoutExceptionString(channel,
                                                                timeout, ops));
      }
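
As far as I can tell, count == 0 only means that nothing became readable on the channel before the timeout, which can happen even when the TCP connection itself is fine. Here is a minimal sketch with plain java.nio (not Hadoop's SocketIOWithTimeout; the class and method names are hypothetical) that connects to a local server which accepts the connection but never writes anything:

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class SelectTimeoutDemo {
    /**
     * Connects to a local server that never sends data and returns the number
     * of channels that became readable within timeoutMillis.
     * timeoutMillis must be greater than 0, because select(0) blocks forever.
     */
    static int selectOnSilentPeer(long timeoutMillis) throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 lets the OS pick a free port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            InetSocketAddress addr = (InetSocketAddress) server.getLocalAddress();

            try (SocketChannel client = SocketChannel.open(addr);   // connect succeeds
                 SocketChannel accepted = server.accept();          // peer accepts, then stays silent
                 Selector selector = Selector.open()) {
                client.configureBlocking(false);
                client.register(selector, SelectionKey.OP_READ);
                // Nothing is ever written by the peer, so no key becomes ready.
                return selector.select(timeoutMillis);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int count = selectOnSilentPeer(1000);
        System.out.println("selected = " + count);
    }
}
```

So a successful telnet does not rule out this timeout; what remains open is why no response arrives within the 10000 ms.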

Why did the selector select nothing? The data node is not under heavy load, and GC and the network are all OK.
Thanks.

Haitao Yao
weibo: @haitao_yao
Skype:  haitao.yao.final




--
Without learning, you don't know




