hadoop-mapreduce-user mailing list archives

From: "S.L" <simpleliving...@gmail.com>
Subject: Re: DataNode Timeout exceptions.
Date: Thu, 28 May 2015 23:31:30 GMT
Hi Ted,

I have only 3 DataNodes.

When I check the logs, I see the following exception in the DataNode log
and no exceptions in the NameNode log.

Stack trace from the DataNode log:

2015-05-27 10:52:34,741 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: exception:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for
channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/123.32.23.234:50010
remote=/123.32.23.234:56653]
at
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
at
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
at
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:712)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:340)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:101)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:65)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
at java.lang.Thread.run(Thread.java:745)
2015-05-27 10:52:34,772 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
123.32.23.234:50010, dest: /123.32.23.234:56653, bytes: 1453056, op:
HDFS_READ, cliID:
DFSClient_attempt_1431824165463_0265_m_000002_0_-805582199_1, offset: 0,
srvID: 3eb119a1-b922-4b38-9adf-35074dc88c94, blockid:
BP-1751673171-123.32.23.234-1431824104307:blk_1073750543_9719, duration:
481096638884
2015-05-27 10:52:34,772 WARN
org.apache.hadoop.hdfs.server.datanode.DataNode:
DatanodeRegistration(123.32.23.234,
datanodeUuid=3eb119a1-b922-4b38-9adf-35074dc88c94, infoPort=50075,
ipcPort=50020,
storageInfo=lv=-51;cid=CID-f3f9b2dc-893a-45f3-8bac-54fe5d77acfc;nsid=1583960326;c=0):Got
exception while serving
BP-1751673171-123.32.23.234-1431824104307:blk_1073750543_9719 to /
123.32.23.234:56653
java.net.SocketTimeoutException: 480000 millis timeout while waiting for
channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/123.32.23.234:50010
remote=/123.32.23.234:56653]
at
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
at
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
at
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:712)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:340)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:101)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:65)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
at java.lang.Thread.run(Thread.java:745)
2015-05-27 10:52:34,772 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
server1.dealyaft.com:50010:DataXceiver
error processing READ_BLOCK operation  src: /123.32.23.234:56653 dest: /
123.32.23.234:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for
channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/123.32.23.234:50010
remote=/123.32.23.234:56653]
at
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
at
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
at
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:712)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:340)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:101)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:65)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
at java.lang.Thread.run(Thread.java:745)
2015-05-27 10:52:35,890 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: exception:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for
channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/123.32.23.234:50010
remote=/123.32.23.234:56655]
at
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
at
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
at
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:712)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:340)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:101)
at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:65)
at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
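
The exception messages above can be summarized with a short script. This is only a minimal sketch in Python: the helper name `parse_timeout` and the regular expression are illustrative assumptions, not part of Hadoop. It extracts the timeout, the direction (read or write) and the two socket endpoints from one message, and reports whether both ends of the connection are on the same host.

```python
import re

# Hypothetical helper, not part of Hadoop: it only understands the
# SocketTimeoutException message format shown in the log above.
PATTERN = re.compile(
    r"(?P<millis>\d+) millis timeout while waiting for channel to be ready "
    r"for (?P<op>read|write)\. ch : "
    r"java\.nio\.channels\.SocketChannel\[connected "
    r"local=/(?P<local_host>[\d.]+):(?P<local_port>\d+) "
    r"remote=/(?P<remote_host>[\d.]+):(?P<remote_port>\d+)\]"
)


def parse_timeout(text):
    """Return a dict describing the first timeout found in text, or None."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(text.split())
    match = PATTERN.search(flat)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("millis", "local_port", "remote_port"):
        info[key] = int(info[key])
    # True when the reader and the DataNode run on the same machine.
    info["same_host"] = info["local_host"] == info["remote_host"]
    return info


# One message copied from the DataNode log above, including the mail wrapping.
sample = """java.net.SocketTimeoutException: 480000 millis timeout while waiting for
channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/123.32.23.234:50010
remote=/123.32.23.234:56653]"""

result = parse_timeout(sample)
print(result["millis"] // 1000, "seconds,", result["op"],
      "- same host:", result["same_host"])
```

For the sample, the helper reports a write timeout of 480000 ms between two ports on 123.32.23.234, so the DataNode was waiting on a reader running on the same machine. As far as I know, 480000 ms is the stock default of `dfs.datanode.socket.write.timeout`, which would mean the value was not changed by configuration; treat that as an assumption to verify against your own `hdfs-site.xml`. The same helper should also accept the 65000 ms read timeout in the quoted client log below.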

On Tue, May 26, 2015 at 8:29 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> bq. All datanodes 112.123.123.123:50010 are bad. Aborting...
>
> How many datanodes do you have?
>
> Can you check the datanode and namenode logs?
>
> Cheers
>
> On Tue, May 26, 2015 at 5:00 PM, S.L <simpleliving016@gmail.com> wrote:
>
>> Hi All,
>>
>> I am on Apache YARN 2.3.0 and lately I have been seeing these exceptions
>> frequently. Can someone tell me the root cause of this issue?
>>
>> I have set the following property in mapred-site.xml. Is there any
>> other property that I also need to set?
>>
>>     <property>
>>       <name>mapreduce.task.timeout</name>
>>       <value>1800000</value>
>>       <description>
>>       The timeout value for tasks. I set this because the JVMs might be
>> busy in GC, which causes timeouts in Hadoop tasks.
>>       </description>
>>     </property>
>>
>>
>>
>> 15/05/26 02:06:53 WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor
>> exception  for block
>> BP-1751673171-112.123.123.123-1431824104307:blk_1073749395_8571
>> java.net.SocketTimeoutException: 65000 millis timeout while waiting for
>> channel to be ready for read. ch :
>> java.nio.channels.SocketChannel[connected local=/112.123.123.123:35398
>> remote=/112.123.123.123:50010]
>> at
>> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>> at
>> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
>> at
>> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
>> at
>> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
>> at java.io.FilterInputStream.read(FilterInputStream.java:83)
>> at java.io.FilterInputStream.read(FilterInputStream.java:83)
>> at
>> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1881)
>> at
>> org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:116)
>> at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:726)
>> 15/05/26 02:06:53 INFO mapreduce.JobSubmitter: Cleaning up the staging
>> area /tmp/hadoop-yarn/staging/df/.staging/job_1431824165463_0221
>> 15/05/26 02:06:54 WARN security.UserGroupInformation:
>> PriviledgedActionException as:df (auth:SIMPLE) cause:java.io.IOException:
>> All datanodes 112.123.123.123:50010 are bad. Aborting...
>> 15/05/26 02:06:54 WARN security.UserGroupInformation:
>> PriviledgedActionException as:df (auth:SIMPLE) cause:java.io.IOException:
>> All datanodes 112.123.123.123:50010 are bad. Aborting...
>> Exception in thread "main" java.io.IOException: All datanodes
>> 112.123.123.123:50010 are bad. Aborting...
>> at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1023)
>> at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:838)
>> at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:483)
>>
>>
>>
>>
>
