hadoop-hdfs-user mailing list archives

From: jeff whiting <je...@qualtrics.com>
Subject: Lots of Different Kind of Datanode Errors
Date: Fri, 04 Jun 2010 15:56:26 GMT
I had my HRegionServers go down due to an HDFS exception, and in the DataNode logs I'm seeing
a lot of different and varied exceptions.  I've since increased the data xceiver count (snippet
below), but the other errors don't make a lot of sense to me.
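
For reference, that limit is controlled by dfs.datanode.max.xcievers in hdfs-site.xml (yes, with
the misspelling); a minimal sketch of the change, with the value shown only as an example rather
than what I actually set:

<!-- hdfs-site.xml on each datanode: raises the cap on concurrent
     DataXceiver threads; the datanode has to be restarted to pick it up. -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>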

Among the exceptions I'm seeing are:

2010-06-04 07:41:56,917 ERROR datanode.DataNode (DataXceiver.java:run(131)) - DatanodeRegistration(192.168.1.184:50010,
storageID=DS-1601700079-192.168.1.184-50010-1274208308658, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException
    at java.io.DataInputStream.readByte(DataInputStream.java:250)
    at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
    at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
    at org.apache.hadoop.io.Text.readString(Text.java:400)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:313)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
    at java.lang.Thread.run(Thread.java:619)


2010-06-04 08:49:56,389 ERROR datanode.DataNode (DataXceiver.java:run(131)) - DatanodeRegistration(192.168.1.184:50010,
storageID=DS-1601700079-192.168.1.184-50010-1274208308658, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
    at sun.nio.ch.IOUtil.read(IOUtil.java:206)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)


2010-06-04 05:36:54,840 ERROR datanode.DataNode (DataXceiver.java:run(131)) - DatanodeRegistration(192.168.1.184:50010,
storageID=DS-1601700079-192.168.1.184-50010-1274208308658, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: xceiverCount 2049 exceeds the limit of concurrent xcievers 2047
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:88)
    at java.lang.Thread.run(Thread.java:619)

2010-06-04 05:36:48,848 ERROR datanode.DataNode (DataXceiver.java:run(131)) - DatanodeRegistration(192.168.1.184:50010,
storageID=DS-1601700079-192.168.1.184-50010-1274208308658, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready
for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.1.184:50010 remote=/192.168.1.184:55349]
    at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
    at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
    at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:313)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:400)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:180)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:95)
    at java.lang.Thread.run(Thread.java:619)
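
As a side note, I believe the 480000 ms in that last trace is just the DataNode's default socket
write timeout (8 minutes), which as far as I know is controlled by dfs.datanode.socket.write.timeout.
A sketch of how it could be raised in hdfs-site.xml, with an illustrative value only:

<!-- hdfs-site.xml: timeout (in milliseconds) the datanode allows when
     writing block data out to a socket; the default is 480000 (8 minutes). -->
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>960000</value>
</property>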

The EOFException is the one I see most often.  I'm also unsure how I would get a "connection
reset by peer" when the connection is local.  Why is the stream ending prematurely?  Any idea
what is going on?

Thanks,
~Jeff

--
Jeff Whiting
Qualtrics Senior Software Engineer
jeffw@qualtrics.com





