hbase-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-421) TestRegionServerExit broken
Date Fri, 08 Feb 2008 03:09:07 GMT

     [ https://issues.apache.org/jira/browse/HBASE-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-421:
--------------------------------

    Attachment: patch.txt

Increase memory for tests; 256M may be too little.

> TestRegionServerExit broken
> ---------------------------
>
>                 Key: HBASE-421
>                 URL: https://issues.apache.org/jira/browse/HBASE-421
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.2.0
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
>            Priority: Critical
>         Attachments: patch.txt, patch.txt
>
>
> TestRegionServerExit has a couple of problems:
> 1. Region server tries to start http server on a port already in use:
>     [junit] 2008-02-07 07:01:06,529 FATAL [RegionServer:2] hbase.HRegionServer(867): Failed init
>     [junit] java.io.IOException: Problem starting http server
>     [junit] 	at org.apache.hadoop.hbase.util.InfoServer.start(InfoServer.java:227)
>     [junit] 	at org.apache.hadoop.hbase.HRegionServer.startServiceThreads(HRegionServer.java:928)
>     [junit] 	at org.apache.hadoop.hbase.HRegionServer.init(HRegionServer.java:863)
>     [junit] 	at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:633)
>     [junit] 	at java.lang.Thread.run(Thread.java:595)
>     [junit] Caused by: org.mortbay.util.MultiException[java.net.BindException: Address already in use]
>     [junit] 	at org.mortbay.http.HttpServer.doStart(HttpServer.java:731)
>     [junit] 	at org.mortbay.util.Container.start(Container.java:72)
>     [junit] 	at org.apache.hadoop.hbase.util.InfoServer.start(InfoServer.java:205)
>     [junit] 	... 4 more
>     [junit] 2008-02-07 07:01:06,530 FATAL [RegionServer:2] hbase.HRegionServer(772): Unhandled exception. Aborting...
> The region server that died apparently was serving the root region.
> The test case apparently has a long timeout for finding the root region, because the log shows many repetitions of:
>     [junit] 2008-02-07 07:04:14,813 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(708): Wake. Retry finding root region.
>     [junit] 2008-02-07 07:04:14,814 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(704): Sleeping. Waiting for root region.
>     [junit] 2008-02-07 07:04:24,823 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(708): Wake. Retry finding root region.
>     [junit] 2008-02-07 07:04:24,827 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(704): Sleeping. Waiting for root region.
>     [junit] 2008-02-07 07:04:34,833 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(708): Wake. Retry finding root region.
>     [junit] 2008-02-07 07:04:34,836 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(704): Sleeping. Waiting for root region.
>     [junit] 2008-02-07 07:04:44,842 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(708): Wake. Retry finding root region.
> until finally the client gives up:
>     [junit] 2008-02-07 07:04:44,843 FATAL [Thread-540] hbase.TestRegionServerExit$1(161): could not re-open meta table because
>     [junit] org.apache.hadoop.hbase.NoServerForRegionException: Timed out trying to locate root region
>     [junit] 	at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:718)
>     [junit] 	at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:329)
>     [junit] 	at org.apache.hadoop.hbase.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:311)
>     [junit] 	at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:476)
>     [junit] 	at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:339)
>     [junit] 	at org.apache.hadoop.hbase.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:311)
>     [junit] 	at org.apache.hadoop.hbase.HTable.getRegionLocation(HTable.java:114)
>     [junit] 	at org.apache.hadoop.hbase.HTable$ClientScanner.nextScanner(HTable.java:889)
>     [junit] 	at org.apache.hadoop.hbase.HTable$ClientScanner.<init>(HTable.java:817)
>     [junit] 	at org.apache.hadoop.hbase.HTable.obtainScanner(HTable.java:522)
>     [junit] 	at org.apache.hadoop.hbase.HTable.obtainScanner(HTable.java:411)
>     [junit] 	at org.apache.hadoop.hbase.TestRegionServerExit$1.run(TestRegionServerExit.java:156)
>     [junit] 	at java.lang.Thread.run(Thread.java:595)
>     [junit] Exception in thread "Thread-540" junit.framework.AssertionFailedError
>     [junit] 	at junit.framework.Assert.fail(Assert.java:47)
>     [junit] 	at junit.framework.Assert.fail(Assert.java:53)
>     [junit] 	at org.apache.hadoop.hbase.TestRegionServerExit$1.run(TestRegionServerExit.java:162)
>     [junit] 	at java.lang.Thread.run(Thread.java:595)
> This is not the way the test is supposed to run at all.
> It appears that when we start multiple region servers in a MiniHBaseCluster, they all try to start their HTTP servers on the same port. In the past, I believe the HTTP server start failure was not fatal, so the test ran.
> We should either add a setting for MiniHBaseCluster that tells the master and region servers not to start their HTTP servers, provide some way to keep multiple servers from starting on the same port, or make HTTP startup failure non-fatal.
> Tests like these are good as they (eventually) point out a regression to us.
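
One of the options listed above, keeping multiple servers from starting on the same port, can be illustrated with plain java.net sockets. The following is only a minimal sketch under stated assumptions, not the contents of patch.txt and not HBase's InfoServer code: the class PortPicker, the method bindWithFallback, and the port number 18080 are hypothetical names and values chosen for illustration. It tries a preferred port and, if that port is already in use, asks the operating system for any free port by binding to port 0.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

public class PortPicker {
    /**
     * Hypothetical helper (not part of HBase): bind to preferredPort,
     * and fall back to an OS-assigned free port if it is already in use.
     */
    static ServerSocket bindWithFallback(int preferredPort) throws IOException {
        try {
            return new ServerSocket(preferredPort);
        } catch (BindException e) {
            // Port 0 tells the OS to pick any currently free port.
            return new ServerSocket(0);
        }
    }

    public static void main(String[] args) throws IOException {
        // 18080 is an arbitrary example port, not an HBase default.
        try (ServerSocket first = bindWithFallback(18080);
             ServerSocket second = bindWithFallback(first.getLocalPort())) {
            // The second server asked for the port the first one holds,
            // so it ends up on a different port instead of failing.
            System.out.println("first=" + first.getLocalPort()
                    + " second=" + second.getLocalPort());
        }
    }
}
```

In a multi-server test cluster, each server would then report the port it actually obtained (getLocalPort()) rather than assuming the configured one. Whether this is preferable to disabling the HTTP servers in tests, or to making the startup failure non-fatal, is a design choice the issue leaves open.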

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

