hadoop-user mailing list archives

From: Robert Molina <rmol...@hortonworks.com>
Subject: Re: Socket timeout for BlockReaderLocal
Date: Tue, 04 Dec 2012 19:54:40 GMT
Hi Haitao,
To help isolate the problem, what happens if you run a different job?  Also,
if you view the namenode web UI or the web UI of the specific datanode having
the issue, are there any indicators that it is down?
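For a scripted version of that check, probing the datanode's embedded web
server is one option. A minimal sketch; the port 50075 used here is the
Hadoop 1.x datanode web UI default (dfs.datanode.http.address), so adjust the
host and port for your cluster:

    import java.net.HttpURLConnection;
    import java.net.URL;

    public class DatanodeUiProbe {
        public static void main(String[] args) throws Exception {
            // 50075 is the default datanode web UI port in Hadoop 1.x;
            // adjust host and port for your cluster.
            URL url = new URL("http://10.130.110.80:50075/");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(5000); // fail fast if the daemon is down
            conn.setReadTimeout(5000);
            System.out.println("datanode web UI answered: HTTP "
                    + conn.getResponseCode());
            conn.disconnect();
        }
    }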

Regards,
Robert

On Tue, Dec 4, 2012 at 12:49 AM, panfei <cnweike@gmail.com> wrote:

> I noticed that you are using JDK 1.7; personally I prefer 1.6.x.
> If your firewall is OK, you can check your RPC service to see if it is also
> OK, and test it with telnet 10.130.110.80 50020.
> I suggested Hive because HQL (SQL-like) is familiar to most people, and the
> learning curve is smooth.
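The programmatic equivalent of that telnet test is a bare TCP connect; a
minimal sketch using the host and port from the error below:

    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class PortProbe {
        public static void main(String[] args) throws Exception {
            // Equivalent of `telnet 10.130.110.80 50020`: a bare TCP connect.
            // Note that success only proves the port is open, not that the
            // RPC service behind it actually answers requests.
            Socket s = new Socket();
            try {
                s.connect(new InetSocketAddress("10.130.110.80", 50020), 5000);
                System.out.println("connect OK");
            } finally {
                s.close();
            }
        }
    }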
>
>
> 2012/12/4 Haitao Yao <yao.erix@gmail.com>
>
>> The firewall is OK.
>> Well, personally I prefer Pig. And since this is a big project, switching
>> from Pig to Hive would not be easy.
>> Thanks.
>>
>>   Haitao Yao
>> yao.erix@gmail.com
>> weibo: @haitao_yao
>> Skype:  haitao.yao.final
>>
>> On 2012-12-4, at 3:14 PM, panfei <cnweike@gmail.com> wrote:
>>
>> Please check your firewall settings. And why not use Hive to do the work?
>>
>>
>> 2012/12/4 Haitao Yao <yao.erix@gmail.com>
>>
>>> Hi all,
>>> I'm using Hadoop 1.2.0, java version "1.7.0_05".
>>> When running my Pig script, the workers always report this error, and
>>> the MR jobs run very slowly.
>>> Increasing the dfs.socket.timeout value does not help. The network is OK;
>>> telnet to port 50020 always succeeds.
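For reference, raising that timeout on the client side looks roughly like the
sketch below; the key and the 120000 ms value are examples rather than the
poster's actual change, and the same property can equally be set in
hdfs-site.xml:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class TimeoutConfExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Example value in milliseconds; /some/file is a placeholder.
            conf.setInt("dfs.socket.timeout", 120000);
            FileSystem fs = FileSystem.get(conf);
            FSDataInputStream in = fs.open(new Path("/some/file"));
            in.close();
        }
    }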
>>> Here's the stack trace:
>>>
>>> 2012-12-04 14:29:41,323 INFO org.apache.hadoop.hdfs.DFSClient: Failed to read
>>> blk_-2337696885631113108_11054058 on local machine
>>> java.net.SocketTimeoutException: Call to /10.130.110.80:50020 failed on socket
>>> timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while
>>> waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected
>>> local=/10.130.110.80:57689 remote=/10.130.110.80:50020]
>>> 	at org.apache.hadoop.ipc.Client.wrapException(Client.java:1140)
>>> 	at org.apache.hadoop.ipc.Client.call(Client.java:1112)
>>> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
>>> 	at $Proxy3.getProtocolVersion(Unknown Source)
>>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
>>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:392)
>>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:374)
>>> 	at org.apache.hadoop.hdfs.DFSClient.createClientDatanodeProtocolProxy(DFSClient.java:212)
>>> 	at org.apache.hadoop.hdfs.BlockReaderLocal$LocalDatanodeInfo.getDatanodeProxy(BlockReaderLocal.java:90)
>>> 	at org.apache.hadoop.hdfs.BlockReaderLocal$LocalDatanodeInfo.access$200(BlockReaderLocal.java:65)
>>> 	at org.apache.hadoop.hdfs.BlockReaderLocal.getBlockPathInfo(BlockReaderLocal.java:224)
>>> 	at org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:145)
>>> 	at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:509)
>>> 	at org.apache.hadoop.hdfs.DFSClient.access$800(DFSClient.java:78)
>>> 	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2231)
>>> 	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2384)
>>> 	at java.io.DataInputStream.read(DataInputStream.java:149)
>>> 	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>>> 	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
>>> 	at org.apache.pig.impl.io.BufferedPositionedInputStream.read(BufferedPositionedInputStream.java:52)
>>> 	at org.apache.pig.impl.io.InterRecordReader.nextKeyValue(InterRecordReader.java:86)
>>> 	at org.apache.pig.impl.io.InterStorage.getNext(InterStorage.java:77)
>>> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
>>> 	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
>>> 	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>>> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>>> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>> 	at java.security.AccessController.doPrivileged(Native Method)
>>> 	at javax.security.auth.Subject.doAs(Subject.java:415)
>>> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
>>> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>> Caused by: java.net.SocketTimeoutException: 10000 millis timeout while waiting
>>> for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected
>>> local=/10.130.110.80:57689 remote=/10.130.110.80:50020]
>>> 	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>> 	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>>> 	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>>> 	at java.io.FilterInputStream.read(FilterInputStream.java:133)
>>> 	at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:361)
>>> 	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>>> 	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
>>> 	at java.io.DataInputStream.readInt(DataInputStream.java:387)
>>> 	at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:841)
>>> 	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:786)
>>>
>>>
>>> I checked the source code; the exception happens here (in
>>> org.apache.hadoop.net.SocketIOWithTimeout, per the stack trace):
>>>
>>>       //now wait for socket to be ready.
>>>       int count = 0;
>>>       try {
>>>         count = selector.select(channel, ops, timeout);
>>>       } catch (IOException e) { //unexpected IOException.
>>>         closed = true;
>>>         throw e;
>>>       }
>>>
>>>       if (count == 0) {
>>>         // here!!
>>>         throw new SocketTimeoutException(timeoutExceptionString(channel,
>>>                                                                 timeout,
>>>                                                                 ops));
>>>       }
>>>
>>> Why did the selector select nothing? The data node is not under heavy load,
>>> and GC and the network are all fine.
>>> Thanks.
>>>
>>>   Haitao Yao
>>> yao.erix@gmail.com
>>> weibo: @haitao_yao
>>> Skype:  haitao.yao.final
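A note on the code snippet quoted above: selector.select(channel, ops, timeout)
returning 0 means the channel never became ready for the requested operation
(a read, here) within the timeout. In other words, the TCP connection was
established but the datanode's IPC service never sent a response byte, which
is why a successful telnet (which only tests the connect) does not contradict
this error. A minimal standalone demo of the same mechanism, using plain
java.nio rather than Hadoop code:

    import java.net.InetSocketAddress;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;

    public class SelectTimeoutDemo {
        public static void main(String[] args) throws Exception {
            // A local server that accepts connections but never writes,
            // mimicking a daemon whose RPC service stops responding.
            ServerSocketChannel server = ServerSocketChannel.open();
            server.bind(new InetSocketAddress("127.0.0.1", 0));

            // The connect succeeds, just like telnet would...
            SocketChannel ch = SocketChannel.open(server.getLocalAddress());
            ch.configureBlocking(false);

            Selector selector = Selector.open();
            ch.register(selector, SelectionKey.OP_READ);

            // ...but no data ever arrives, so select() returns 0 after the
            // timeout, the exact condition that makes Hadoop throw
            // SocketTimeoutException in the snippet above.
            int count = selector.select(2000);
            System.out.println("select returned " + count);

            selector.close();
            ch.close();
            server.close();
        }
    }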
>>>
>>>
>>
>>
>> --
>> 不学习,不知道 (If you don't learn, you don't know)
>>
>>
>>
>
>
> --
> 不学习,不知道 (If you don't learn, you don't know)
>
>
