hadoop-mapreduce-user mailing list archives

From Azuryy Yu <azury...@gmail.com>
Subject Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010
Date Mon, 11 Mar 2013 02:23:34 GMT
xcievers: 4096 is enough, and I don't think you pasted the full stack
trace. The socket was ready for receiving, but the client closed the
connection abnormally, which is generally why you get this error.
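
If you want to rule out the timeout itself while you hunt for the
misbehaving client, the relevant datanode knobs are
dfs.datanode.socket.write.timeout (the 480000 ms in the log below is its
default) and dfs.socket.timeout (the read side, default 60000 ms). A
minimal hdfs-site.xml sketch with illustrative values only; raising these
hides slow or vanishing clients rather than fixing them:

<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <!-- write-side timeout; default 480000 ms (8 minutes) -->
  <value>600000</value>
</property>
<property>
  <name>dfs.socket.timeout</name>
  <!-- read-side timeout; default 60000 ms -->
  <value>120000</value>
</property>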


On Mon, Mar 11, 2013 at 2:33 AM, Pablo Musa <pablo@psafe.com> wrote:

>  This variable was already set:
> <property>
>   <name>dfs.datanode.max.xcievers</name>
>   <value>4096</value>
>   <final>true</final>
> </property>
>
> Should I increase it more?
>
> The same error happens every 5-8 minutes on datanode 172.17.2.18.
>
> 2013-03-10 15:26:42,818 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> PSLBHDN002:50010:DataXceiver error processing READ_BLOCK operation src: /172.17.2.18:46422 dest: /172.17.2.18:50010
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010 remote=/172.17.2.18:46422]
>
>
> ]$ lsof | wc -l
> 2393
>
> ]$ lsof | grep hbase | wc -l
> 4
>
> ]$ lsof | grep hdfs | wc -l
> 322
>
> ]$ lsof | grep hadoop | wc -l
> 162
>
> ]$ cat /proc/sys/fs/file-nr
> 4416    0    7327615
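>
> (For reference, the three file-nr fields are allocated file handles,
> allocated-but-unused handles, and the system-wide maximum, so 4416 of a
> possible 7327615 is nowhere near exhaustion.)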
>
> ]$ date
> Sun Mar 10 15:31:47 BRT 2013
>
>
> What could the causes be? How can I extract more information about the error?
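>
> (One way to get more information is to count live DataXceiver threads on
> the datanode and see how close it actually gets to the 4096 cap; a rough
> sketch, assuming a single DataNode JVM and that jstack runs as the same
> user as the datanode:
>
> ]$ jstack $(pgrep -f DataNode) | grep -c 'DataXceiver'
>
> If the count stays far below 4096, the xceiver cap is not the bottleneck
> and the client side of each failed connection is the place to look.)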
>
> Thanks,
> Pablo
>
>
>  On 03/08/2013 09:57 PM, Abdelrahman Shettia wrote:
>
> Hi,
>
>  If the open-file limits for both the hbase and hdfs users are already
> set to more than 30K, please change dfs.datanode.max.xcievers to more
> than the value below.
>
> <property>
>   <name>dfs.datanode.max.xcievers</name>
>   <value>2096</value>
>   <description>PRIVATE CONFIG VARIABLE</description>
> </property>
>
> Try increasing this value and tuning it to your HBase usage.
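>
> (A note on naming: this property is historically misspelled. On Hadoop 2.x
> dfs.datanode.max.xcievers still works as a deprecated alias, but the
> current name is dfs.datanode.max.transfer.threads; a sketch with the new
> name and the same value:
>
> <property>
>   <name>dfs.datanode.max.transfer.threads</name>
>   <value>4096</value>
> </property> )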
>
>
>  Thanks
>
> -Abdelrahman
>
> On Fri, Mar 8, 2013 at 9:28 AM, Pablo Musa <pablo@psafe.com> wrote:
>
>>  I am also having this issue and tried a lot of solutions, but could not
>> solve it.
>>
>>  ]# ulimit -n    # same result when run as root and as hdfs (the datanode user)
>> 32768
>>
>> ]# cat /proc/sys/fs/file-nr
>> 2080    0    8047008
>>
>> ]# lsof | wc -l
>> 5157
>>
>> Sometimes this issue even happens from a node to itself :(
>>
>> I also think this issue is affecting my regionservers, which are
>> crashing all day long!!
>>
>> Thanks,
>> Pablo
>>
>>
>> On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
>>
>> Hi Varun
>>
>>  I believe it is not a ulimit issue.
>>
>>
>>  /etc/security/limits.conf
>>  # End of file
>> *               -      nofile          1000000
>> *               -      nproc           1000000
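>>
>> (It is worth confirming those limits actually reach the daemon: entries in
>> limits.conf are applied by PAM on login sessions, so a datanode started
>> from an init script may never see them. A quick check against the live
>> process, assuming a single DataNode JVM:
>>
>> ]# grep 'open files' /proc/$(pgrep -f DataNode)/limits
>>
>> If this prints the stock defaults rather than 1000000, the limits.conf
>> entries are not being applied to the datanode.)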
>>
>>
>>  Please guide me, guys, I want to fix this. Please share your thoughts on
>> this DataXceiver error.
>>
>> Did I learn something today? If not, I wasted it.
>>
>>
>> On Fri, Mar 8, 2013 at 3:50 AM, varun kumar <varun.uid@gmail.com> wrote:
>>
>>> Hi Dhana,
>>>
>>>  Increase the ulimit for all the datanodes.
>>>
>>>  If you are starting the service as the hadoop user, increase the ulimit
>>> value for the hadoop user.
>>>
>>>  Make the changes in the following file:
>>>
>>>  /etc/security/limits.conf
>>>
>>>  Example:
>>> hadoop          soft    nofile          35000
>>> hadoop          hard    nofile          35000
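>>>
>>>  After editing the file, verify that the new limit shows up in a fresh
>>> login session for that user (a quick check, assuming the user is named
>>> hadoop):
>>>
>>> ]# su - hadoop -c 'ulimit -n'
>>>
>>>  Note that the datanode must be restarted before it picks up the new limit.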
>>>
>>>  Regards,
>>> Varun Kumar.P
>>>
>>>  On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan <
>>> bugcy013@gmail.com> wrote:
>>>
>>>>   Hi Guys
>>>>
>>>>  I am frequently getting this error on my datanodes.
>>>>
>>>>  Please guide me on what the exact problem is.
>>>>
>>>>  dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373 dest: /172.16.30.138:50010
>>>>
>>>> java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280 remote=/172.16.30.140:50010]
>>>> at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>>> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
>>>> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:127)
>>>> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:115)
>>>> at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>>> at java.io.FilterInputStream.read(FilterInputStream.java:66)
>>>> at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:160)
>>>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:405)
>>>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>>> at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>  dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50531 dest: /172.16.30.138:50010
>>>>
>>>> java.io.EOFException: while trying to read 65563 bytes
>>>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:408)
>>>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:452)
>>>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:511)
>>>> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
>>>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:462)
>>>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>>>> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
>>>> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
>>>> at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>  How do I resolve this?
>>>>
>>>>  -Dhanasekaran.
>>>>
>>>>  Did I learn something today? If not, I wasted it.
>>>>
>>>
>>>  --
>>> Regards,
>>> Varun Kumar.P
>>>
>>
>>
>>
>
>
