hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....
Date Sun, 22 Apr 2012 05:10:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258998#comment-13258998 ]

stack commented on HBASE-5833:
------------------------------

More digging.  The newest test added here, testShouldCheckMasterFailOverWhenMETAIsInOpenedState,
is a little interesting.  It was added by this commit:

{code}
------------------------------------------------------------------------
r1172063 | tedyu | 2011-09-17 13:27:00 -0700 (Sat, 17 Sep 2011) | 3 lines

HBASE-4400  .META. getting stuck if RS hosting it is dead and znode state is in
               RS_ZK_REGION_OPENED (Ramkrishna)

{code}

The test is largely copy/paste, confirming state it never uses.  It then shuts the cluster
down explicitly on a cluster object rather than via HBaseTestingUtility (HTU), though it
subsequently starts a new cluster with HBaseTestingUtility.  Not using HTU for both the
shutdown and the startup can leave HTU confused about whether there is a master available,
so we just wait forever.  This seems to be responsible for the case where the test would time
out after 15 minutes and report no tests run and none failed.

I added a timeout for this test of 3 minutes.
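
For illustration only: in JUnit 4 a per-test limit like this is normally written as
@Test(timeout = 180000) on the test method; whether the patch does exactly that is an
assumption here.  The sketch below uses only the JDK to show the general idea of such a
timeout: run the body on a separate thread and give up if it has not finished in time.
TimeoutSketch and finishesWithin are hypothetical names, not part of HBase or JUnit.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper showing how a test timeout can bound a body that never returns. */
final class TimeoutSketch {

  /** Runs body on its own thread; returns true if it finished in time, false if it timed out. */
  static boolean finishesWithin(Runnable body, long timeout, TimeUnit unit) throws Exception {
    ExecutorService runner = Executors.newSingleThreadExecutor();
    Future<?> result = runner.submit(body);
    try {
      result.get(timeout, unit);   // block until the body completes or the limit passes
      return true;
    } catch (TimeoutException e) {
      result.cancel(true);         // interrupt the stuck body
      return false;
    } finally {
      runner.shutdownNow();        // do not leave the runner thread behind
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for a test blocked waiting on a master that never shows up.
    Runnable stuck = () -> {
      try {
        Thread.sleep(Long.MAX_VALUE);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    };
    System.out.println(finishesWithin(stuck, 200, TimeUnit.MILLISECONDS));
    System.out.println(finishesWithin(() -> { }, 200, TimeUnit.MILLISECONDS));
  }
}
```

With a limit in place, a hang shows up as a failure of the one test method instead of
the whole run timing out with nothing reported.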

Another interesting thing is that TestMasterFailover starts a cluster per test method, but
shutdown leaves some threads behind.  I dug in some and was able to clean up an LruBlockCache
eviction thread, but others persist and would take a little more work to undo.  They seem
harmless but I'll list them anyway:

{code}
TestMasterFailover [JUnit]	
	org.eclipse.jdt.internal.junit.runner.RemoteTestRunner at localhost:54811	
		Thread [main] (Running)	
		Thread [ReaderThread] (Running)	
		Thread [Thread-2] (Suspended (breakpoint at line 587 in HBaseTestingUtility))	
			HBaseTestingUtility.shutdownMiniCluster() line: 587	
			TestMasterFailover.testSimpleMasterFailover() line: 178	
			NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]	
			NativeMethodAccessorImpl.invoke(Object, Object[]) line: 39	
			DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 25	
			Method.invoke(Object, Object...) line: 597	
			FrameworkMethod$1.runReflectiveCall() line: 45	
			FrameworkMethod$1(ReflectiveCallable).run() line: 15	
			FrameworkMethod.invokeExplosively(Object, Object...) line: 42	
			InvokeMethod.evaluate() line: 20	
			FailOnTimeout$StatementThread.run() line: 62	
		Daemon Thread [Poller SunPKCS11-Darwin] (Running)	
		Thread [pool-1-thread-1] (Running)	
		Thread [pool-2-thread-1] (Running)	
		Thread [pool-3-thread-1] (Running)	
		Thread [pool-4-thread-1] (Running)	
		Daemon Thread [LeaseChecker] (Running)	
		Daemon Thread [RegionServer:2;192.168.1.74,54842,1335066804457.decayingSampleTick.1] (Running)	
		Daemon Thread [Master:2;192.168.1.74,54838,1335066803952-SendThread(fe80:0:0:0:0:0:0:1%1:21818)] (Running)	
		Daemon Thread [Master:2;192.168.1.74,54838,1335066803952-EventThread] (Running)	
		Daemon Thread [Master:1;192.168.1.74,54836,1335066798880-EventThread] (Running)	
		Daemon Thread [Master:1;192.168.1.74,54836,1335066798880-SendThread(localhost:21818)] (Running)	
	/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bin/java (Apr 21, 2012 8:53:07 PM)	
{code}
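
A listing like the one above came from the Eclipse debugger, but a similar check can be
sketched in plain Java with Thread.getAllStackTraces(): snapshot the live thread names
before a test and diff them against the names afterwards.  LeftoverThreads and its methods
are hypothetical names for illustration, not something in HBase.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper that reports threads which appeared between two snapshots. */
final class LeftoverThreads {

  /** Sorted labels for every live thread in this JVM, marking daemon threads. */
  static Set<String> liveThreadNames() {
    Set<String> names = new TreeSet<>();
    for (Thread t : Thread.getAllStackTraces().keySet()) {
      names.add((t.isDaemon() ? "Daemon Thread [" : "Thread [") + t.getName() + "]");
    }
    return names;
  }

  /** Labels present in the second snapshot but not in the first. */
  static Set<String> leftovers(Set<String> before, Set<String> after) {
    Set<String> diff = new TreeSet<>(after);
    diff.removeAll(before);
    return diff;
  }

  public static void main(String[] args) {
    Set<String> before = liveThreadNames();

    // Stand-in for a thread that a cluster shutdown failed to stop.
    Thread leak = new Thread(() -> {
      try {
        Thread.sleep(60_000);
      } catch (InterruptedException ignored) {
        // interrupted: let the thread exit
      }
    }, "LeaseChecker-demo");
    leak.setDaemon(true);
    leak.start();

    System.out.println(leftovers(before, liveThreadNames()));
    leak.interrupt();
  }
}
```

Calling liveThreadNames() before cluster startup and again after shutdown would give the
same kind of list without attaching a debugger.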

The thread names are enhanced in v2 of this patch, but things like decayingSampleTick are
set in a static, so they are hard to get rid of in test setup.  The SendThread/EventThread
threads belong to the zk client.  Not sure what pool-4-thread-1 and its siblings are (I've
enhanced the HTable executor to include htable in its name so these are identifiable going
forward, but the executor above does not seem to be HTable's).
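
Names like pool-4-thread-1 are what the JDK's default thread factory produces, which is why
they say nothing about their owner.  A minimal sketch of the naming idea, assuming a plain
java.util.concurrent executor: pass a ThreadFactory that puts a prefix in each thread name.
NamedThreadFactory and the "htable-pool" prefix are hypothetical; the actual change to the
HTable executor in the patch may look different.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical factory that names pool threads after their owner. */
final class NamedThreadFactory implements ThreadFactory {
  private final String prefix;
  private final AtomicInteger counter = new AtomicInteger(1);

  NamedThreadFactory(String prefix) {
    this.prefix = prefix;
  }

  @Override
  public Thread newThread(Runnable r) {
    // Produces e.g. "htable-pool-thread-1" instead of the default "pool-N-thread-M".
    Thread t = new Thread(r, prefix + "-thread-" + counter.getAndIncrement());
    t.setDaemon(true);
    return t;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool =
        Executors.newFixedThreadPool(2, new NamedThreadFactory("htable-pool"));
    Callable<String> whoAmI = () -> Thread.currentThread().getName();
    System.out.println(pool.submit(whoAmI).get()); // prints htable-pool-thread-1
    pool.shutdown();
  }
}
```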
                
> 0.92 build has been failing pretty consistently on TestMasterFailover....
> -------------------------------------------------------------------------
>
>                 Key: HBASE-5833
>                 URL: https://issues.apache.org/jira/browse/HBASE-5833
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.92.2
>
>         Attachments: 5833.txt, closehregions.txt
>
>
> Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it locally, it
> seems to hang for me.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
