hadoop-hdfs-issues mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-3655) datanode recoverRbw could hang sometimes
Date Fri, 13 Jul 2012 00:34:35 GMT
Ming Ma created HDFS-3655:
-----------------------------

             Summary: datanode recoverRbw could hang sometimes
                 Key: HDFS-3655
                 URL: https://issues.apache.org/jira/browse/HDFS-3655
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: data-node
            Reporter: Ming Ma
             Fix For: 0.22.1


This bug seems to apply to both 0.22 and Hadoop 2.0. I will shortly upload the initial fix
written by my colleague Xiaobo Peng (some logistics issues are being worked out so that he
can upload patches himself later).


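recoverRbw tries to stop the old writer thread, but it does so while holding the lock (the
FSDataset monitor object) that the old writer thread is itself waiting to acquire (for
example, in the call to data.getTmpInputStreams). The old writer therefore cannot exit, and
the join in stopWriter never returns. The thread dump of the hung DataXceiver: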


"DataXceiver for client /10.110.3.43:40193 [Receiving block blk_-3037542385914640638_57111747 client=DFSClient_attempt_201206021424_0001_m_000401_0]" daemon prio=10 tid=0x00007facf8111800 nid=0x6b64 in Object.wait() [0x00007facd1ddb000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1186)
    - locked <0x00000007856c1200> (a org.apache.hadoop.util.Daemon)
    at java.lang.Thread.join(Thread.java:1239)
    at org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:158)
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.recoverRbw(FSDataset.java:1347)
    - locked <0x00000007838398c0> (a org.apache.hadoop.hdfs.server.datanode.FSDataset)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:119)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.opWriteBlockInternal(DataXceiver.java:391)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.opWriteBlock(DataXceiver.java:327)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.opWriteBlock(DataTransferProtocol.java:405)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.processOp(DataTransferProtocol.java:344)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:183)
    at java.lang.Thread.run(Thread.java:662)
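
The pattern in the stack trace above (join on a thread while holding the monitor that thread
needs) can be reproduced in isolation. The following is a minimal sketch, not HDFS code: the
class name, DATASET_LOCK, and both method names are hypothetical stand-ins, and a timed join
is used so the demo terminates, whereas the trace shows an untimed join. The second method
shows one possible way to avoid the pattern; it is not necessarily what the actual patch does.

```java
import java.util.concurrent.CountDownLatch;

public class RecoverRbwHangSketch {
    // Hypothetical stand-in for the FSDataset monitor object.
    private static final Object DATASET_LOCK = new Object();

    // Starts a "writer" thread that must enter DATASET_LOCK before it can exit.
    private static Thread startWriter(CountDownLatch aboutToLock) {
        Thread writer = new Thread(() -> {
            aboutToLock.countDown();
            synchronized (DATASET_LOCK) {
                // Stands in for a call such as data.getTmpInputStreams.
            }
        }, "old-writer");
        writer.setDaemon(true);
        writer.start();
        return writer;
    }

    // Interrupts and joins the writer while still holding the lock.
    // Returns true if the writer exited within timeoutMs (must be > 0).
    public static boolean stopWriterWhileHoldingLock(long timeoutMs) {
        CountDownLatch aboutToLock = new CountDownLatch(1);
        try {
            synchronized (DATASET_LOCK) {
                Thread writer = startWriter(aboutToLock);
                aboutToLock.await();
                writer.interrupt();      // does not unblock a thread waiting to enter a monitor
                writer.join(timeoutMs);  // an untimed join() would wait here indefinitely
                return !writer.isAlive();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Same setup, but the lock is released before interrupting and joining.
    public static boolean stopWriterOutsideLock(long timeoutMs) {
        CountDownLatch aboutToLock = new CountDownLatch(1);
        try {
            Thread writer;
            synchronized (DATASET_LOCK) {
                writer = startWriter(aboutToLock);
                aboutToLock.await();
            }
            writer.interrupt();
            writer.join(timeoutMs);
            return !writer.isAlive();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("exited while lock held:    " + stopWriterWhileHoldingLock(200));
        System.out.println("exited after lock release: " + stopWriterOutsideLock(5000));
    }
}
```

In the first method the writer cannot exit until the caller leaves the synchronized block, so
the join can only time out. In the second, the writer acquires the monitor as soon as it is
released and the join returns.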



       
