hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8939) Hanging unit tests
Date Thu, 18 Jul 2013 05:32:48 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13712027#comment-13712027 ]

stack commented on HBASE-8939:
------------------------------

I added a post-build task to the Apache builds that runs our zombie tracker from ./dev-tools/test-patch.sh.
It caught one just now:

https://builds.apache.org/job/HBase-TRUNK/4265/console

TestLogRollAbort won't shut down.  It is a bit of a strange test in that it kills HDFS out
from under us and tries to ensure we don't lose edits.  We are stuck on a thread join.  It
looks like it has a timer of two minutes, but oddly the test claims to have 'passed' early
enough in the game:

Running org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 147.206 sec

Here is where we are bound up.

{code}"pool-1-thread-1" prio=10 tid=0x7614b400 nid=0x62da in Object.wait() [0x7774f000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x7fd2d268> (a org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
	at java.lang.Thread.join(Thread.java:1186)
	- locked <0x7fd2d268> (a org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
	at java.lang.Thread.join(Thread.java:1239)
	at org.apache.hadoop.hbase.util.JVMClusterUtil.shutdown(JVMClusterUtil.java:242)
	at org.apache.hadoop.hbase.LocalHBaseCluster.shutdown(LocalHBaseCluster.java:427)
	at org.apache.hadoop.hbase.MiniHBaseCluster.shutdown(MiniHBaseCluster.java:495)
	at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:742)
	at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:711)
	at org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort.tearDown(TestLogRollAbort.java:114)

{code}

But we are also stuck here in setup:

{code}
"LeaseChecker@DFSClient[clientName=DFSClient_1663452662, ugi=jenkins]: java.lang.Throwable: for testing
	at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.toString(DFSClient.java:1393)
	at org.apache.hadoop.util.Daemon.<init>(Daemon.java:38)
	at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.put(DFSClient.java:1306)
	at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:716)
	at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:182)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:555)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:536)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:443)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:435)
	at org.apache.hadoop.hbase.util.FSUtils.setVersion(FSUtils.java:476)
	at org.apache.hadoop.hbase.util.FSUtils.setVersion(FSUtils.java:361)
	at org.apache.hadoop.hbase.HBaseTestingUtility.createRootDir(HBaseTestingUtility.java:773)
	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:645)
	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:627)
	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:575)
	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:562)
	at org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort.setUp(TestLogRollAbort.java:102)

{code}

So the thread dump shows us in both setup and shutdown at the same time.

I'm going to disable this test for now so we get clean builds.
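
For illustration only: the shutdown trace above is blocked in an unbounded Thread.join(). A minimal sketch of the alternative, a join with an overall deadline that reports which threads are still alive, is shown here in Python rather than Java to keep it short. It is not HBase code; `join_with_deadline` and its parameters are hypothetical names.

```python
import threading
import time


def join_with_deadline(threads, timeout_seconds):
    """Join every thread under one shared deadline.

    Returns the names of threads that are still alive once the
    deadline has passed, instead of blocking forever on them.
    """
    deadline = time.monotonic() + timeout_seconds
    stuck = []
    for t in threads:
        # Give each thread only whatever time is left overall.
        remaining = max(0.0, deadline - time.monotonic())
        t.join(remaining)
        if t.is_alive():
            stuck.append(t.name)
    return stuck
```

A teardown built this way could log the returned names and fail the test, rather than leaving a zombie process behind for the build to trip over.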
                
> Hanging unit tests
> ------------------
>
>                 Key: HBASE-8939
>                 URL: https://issues.apache.org/jira/browse/HBASE-8939
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>            Reporter: stack
>             Fix For: 0.95.2
>
>         Attachments: 8939.txt
>
>
> We have hanging tests.  Here are a few from this morning's review:
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh  https://builds.apache.org/job/hbase-0.95-on-hadoop2/176/consoleText
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100 3300k    0 3300k    0     0   508k      0 --:--:--  0:00:06 --:--:--  621k
> Hanging test: Running org.apache.hadoop.hbase.TestIOFencing
> Hanging test: Running org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
> {code}
> And...
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh http://54.241.6.143/job/HBase-TRUNK-Hadoop-2/396/consoleText
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  779k    0  779k    0     0   538k      0 --:--:--  0:00:01 --:--:--  559k
> Hanging test: Running org.apache.hadoop.hbase.TestIOFencing
> Hanging test: Running org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
> Hanging test: Running org.apache.hadoop.hbase.client.TestFromClientSide3
> {code}
> and....
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh  http://54.241.6.143/job/HBase-0.95/607/consoleText
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  445k    0  445k    0     0   490k      0 --:--:-- --:--:-- --:--:--  522k
> Hanging test: Running org.apache.hadoop.hbase.replication.TestReplicationDisableInactivePeer
> Hanging test: Running org.apache.hadoop.hbase.master.TestAssignmentManager
> Hanging test: Running org.apache.hadoop.hbase.util.TestHBaseFsck
> Hanging test: Running org.apache.hadoop.hbase.regionserver.TestStoreFileBlockCacheSummary
> Hanging test: Running org.apache.hadoop.hbase.IntegrationTestDataIngestSlowDeterministic
> {code}
> and...
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh  http://54.241.6.143/job/HBase-0.95-Hadoop-2/607/consoleText
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  781k    0  781k    0     0   240k      0 --:--:--  0:00:03 --:--:--  244k
> Hanging test: Running org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint
> Hanging test: Running org.apache.hadoop.hbase.client.TestFromClientSide
> Hanging test: Running org.apache.hadoop.hbase.TestIOFencing
> Hanging test: Running org.apache.hadoop.hbase.master.TestMasterFailoverBalancerPersistence
> Hanging test: Running org.apache.hadoop.hbase.master.TestDistributedLogSplitting
> {code}
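
The kind of log scan shown in the listings above can be sketched roughly as follows. This is a minimal sketch and not the actual logic of findHangingTest.sh: it assumes surefire-style console output in which each `Running <class>` line is followed by a `Tests run:` summary line before the next `Running` line, which will not hold when test output is interleaved. The function name `find_hanging_tests` is hypothetical.

```python
import re

RUNNING = re.compile(r"^Running (\S+)$")
RESULT = re.compile(r"^Tests run: \d+")


def find_hanging_tests(console_text):
    """Return test classes with no 'Tests run:' line before the next
    'Running' line (or before the end of the log)."""
    hanging = []
    pending = None  # most recent class still waiting for its summary line
    for raw in console_text.splitlines():
        line = raw.strip()
        started = RUNNING.match(line)
        if started:
            if pending is not None:
                # A new test started before the previous one reported.
                hanging.append(pending)
            pending = started.group(1)
        elif RESULT.match(line) and pending is not None:
            pending = None
    if pending is not None:
        hanging.append(pending)
    return hanging


sample = """\
Running org.example.TestA
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.0 sec
Running org.example.TestB
Running org.example.TestC
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.5 sec
"""
print(find_hanging_tests(sample))
```

A scan like this only catches tests that never print a summary line. A test such as TestLogRollAbort in the comment above, which reports its result and then hangs in teardown, would slip past it, which is why the zombie tracker is needed as well.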

