hadoop-common-user mailing list archives

From Raghu Angadi <rang...@yahoo-inc.com>
Subject Re: DataXceiver Errors in 0.19.1
Date Mon, 13 Apr 2009 16:20:27 GMT

This need not be anything to worry about. Do you see anything fail at the 
user level (a task, job, copy, or script) because of this?

On a distributed system with many nodes, there will be some errors on 
some of the nodes for various reasons (load, hardware, reboots, etc.). 
HDFS usually works around them (because of multiple replicas).
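For reference, the number of replicas HDFS can fall back on is controlled by the dfs.replication property. A minimal configuration fragment (the value 3 is only illustrative; the file name varies by release):

```xml
<!-- hadoop-site.xml on 0.19.x (hdfs-site.xml on later releases) -->
<property>
  <name>dfs.replication</name>
  <!-- replicas per block; a write survives the loss of pipeline
       nodes as long as at least one replica is written -->
  <value>3</value>
</property>
```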

In this particular case, the client is trying to write some data, and one 
of the DataNodes writing a replica might have gone down. HDFS should 
recover from this and write to the rest of the nodes. Please check 
whether the write actually succeeded.
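One way to check from the shell is to list the file and then run fsck on it. A sketch, assuming the output landed under /user/tamir/output (the path is illustrative):

```shell
# Confirm the file exists and has the expected size
hadoop fs -ls /user/tamir/output

# Verify every block is present and sufficiently replicated
hadoop fsck /user/tamir/output -files -blocks -locations
```

If the write succeeded, fsck should report the path as HEALTHY; any under-replicated blocks it finds will normally be re-replicated in the background.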

Raghu.

Tamir Kamara wrote:
> Hi,
> 
> I've recently upgraded to 0.19.1 and now there are some DataXceiver errors in
> the datanodes' logs. There are also messages about an interruption while waiting
> for IO. Both messages are below.
> Can I do something to fix it?
> 
> Thanks,
> Tamir
> 
> 
> 2009-04-13 09:57:20,334 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.14.3:50010,
> storageID=DS-727246419-127.0.0.1-50010-1234873914501, infoPort=50075,
> ipcPort=50020):DataXceiver
> java.io.EOFException: while trying to read 65557 bytes
> 	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:264)
> 	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:308)
> 	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:372)
> 	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:524)
> 	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
> 	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
> 	at java.lang.Thread.run(Unknown Source)
> 
> 
> 2009-04-13 09:57:20,333 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder
> blk_8486030874928774495_54856 1 Exception
> java.io.InterruptedIOException: Interruped while waiting for IO on
> channel java.nio.channels.SocketChannel[connected
> local=/192.168.14.3:50439 remote=/192.168.14.7:50010]. 58972 millis
> timeout left.
> 	at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:277)
> 	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
> 	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
> 	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
> 	at java.io.DataInputStream.readFully(Unknown Source)
> 	at java.io.DataInputStream.readLong(Unknown Source)
> 	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:853)
> 	at java.lang.Thread.run(Unknown Source)
> 

