hadoop-hdfs-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4328) TestLargeBlock#testLargeBlockSize is timing out
Date Thu, 10 Jan 2013 18:40:14 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549886#comment-13549886 ]

Chris Nauroth commented on HDFS-4328:
-------------------------------------

Thread dumps show the test hanging when {{DataBlockScanner#shutdown}} tries to join with the
{{blockScannerThread}}:

{noformat}
"main" prio=5 tid=7fd86d800800 nid=0x10efc1000 in Object.wait() [10efbe000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.Thread.join(Thread.java:1210)
	- locked <7c3965cd8> (a java.lang.Thread)
	at java.lang.Thread.join(Thread.java:1263)
	at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.shutdown(DataBlockScanner.java:251)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownDataBlockScanner(DataNode.java:490)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownPeriodicScanners(DataNode.java:462)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1104)
        ...
{noformat}

Meanwhile, the {{blockScannerThread}} is stuck in what appears to be an infinite wait loop in {{DataTransferThrottler#throttle}}:

{noformat}
"Thread-60" daemon prio=5 tid=7fd86c1a6800 nid=0x11c378000 in Object.wait() [11c377000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at org.apache.hadoop.hdfs.util.DataTransferThrottler.throttle(DataTransferThrottler.java:98)
	- locked <7c3c841a0> (a org.apache.hadoop.hdfs.util.DataTransferThrottler)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:526)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:653)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.verifyBlock(BlockPoolSliceScanner.java:397)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.verifyFirstBlock(BlockPoolSliceScanner.java:476)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scan(BlockPoolSliceScanner.java:633)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scanBlockPoolSlice(BlockPoolSliceScanner.java:599)
	at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:101)
        ...
{noformat}

It's likely that this infinite loop problem existed before the HDFS-4274 patch, but {{blockScannerThread}}
was a daemon thread, so it didn't block datanode shutdown.  With the HDFS-4274 patch, datanode
shutdown now joins this thread and waits for it to finish, so the stuck thread blocks shutdown.
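
To illustrate the interaction described above, here is a minimal, self-contained sketch. It is not Hadoop code: the class {{JoinHangSketch}} and its methods are hypothetical names invented for this example, and the worker's loop is only a stand-in for whatever keeps {{DataTransferThrottler#throttle}} waiting. It shows a daemon thread looping over timed waits, a join that cannot complete while the loop runs, and one possible way out (interrupting the worker), which is an assumption rather than the fix chosen for this issue.

```java
public class JoinHangSketch {

    // Hypothetical stand-in for a scanner thread stuck in repeated timed waits.
    static Thread startStuckWorker(final Object lock) {
        Thread t = new Thread(() -> {
            synchronized (lock) {
                while (true) {              // never exits on its own
                    try {
                        lock.wait(50);      // appears as TIMED_WAITING in a thread dump
                    } catch (InterruptedException e) {
                        return;             // only an interrupt ends the loop
                    }
                }
            }
        }, "stuck-worker");
        t.setDaemon(true);                  // a daemon thread alone does not keep the JVM alive
        t.start();
        return t;
    }

    // Bounded join: returns true if the worker is still running after the wait.
    // An unbounded t.join() here would block for as long as the worker loops.
    static boolean stillAliveAfterJoin(Thread t, long millis) throws InterruptedException {
        t.join(millis);
        return t.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = startStuckWorker(new Object());
        System.out.println("alive after bounded join: " + stillAliveAfterJoin(worker, 200));

        worker.interrupt();                 // assumption: the loop honors interruption
        worker.join(2000);
        System.out.println("alive after interrupt: " + worker.isAlive());
    }
}
```

The bounded join is used here only so the sketch terminates; replacing it with a plain {{join()}} before the interrupt would reproduce the kind of hang seen in the "main" thread dump.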

I need to keep investigating why {{DataTransferThrottler#throttle}} is stuck in an infinite
wait loop.

                
> TestLargeBlock#testLargeBlockSize is timing out
> -----------------------------------------------
>
>                 Key: HDFS-4328
>                 URL: https://issues.apache.org/jira/browse/HDFS-4328
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 3.0.0
>            Reporter: Jason Lowe
>
> For some time now TestLargeBlock#testLargeBlockSize has been timing out on trunk.  It
> is getting hung up during cluster shutdown, and after 15 minutes surefire kills it and causes
> the build to fail since it exited uncleanly.
> In addition to fixing the hang, we should consider adding a timeout parameter to the
> @Test decorator for this test.

