hbase-dev mailing list archives

From Karthik Ranganathan <kranganat...@facebook.com>
Subject Cannot locate root region
Date Thu, 28 Jan 2010 23:57:39 GMT
Hey guys,

I ran into some issues while testing and would like to better understand what happened. I got
the following exception when I opened the web UI:

Trying to contact region server 10.129.68.204:60020 for region .META.,,1, row '', but failed
after 3 attempts.
Exceptions:
org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException:
.META.,,1
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2254)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1837)
        at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)


From a program that reads from an HBase table:
java.lang.reflect.UndeclaredThrowableException
        at $Proxy1.getRegionInfo(Unknown Source)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:985)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:625)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:601)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:675)
<snip>


I followed up in the HMaster's log:

2010-01-28 11:21:16,148 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner
scan of 1 row(s) of meta region {server: 10.129.68.204:60020, regionname: .META.,,1, startKey:
<>} complete
2010-01-28 11:21:16,148 INFO org.apache.hadoop.hbase.master.BaseScanner: All 1 .META. region(s)
scanned
2010-01-28 11:21:34,539 DEBUG org.apache.hadoop.hbase.master.ServerManager: Received report
from unknown server -- telling it to MSG_CALL_SERVER_STARTUP: 10.129.68.203,60020,1263605543210
2010-01-28 11:21:35,622 INFO org.apache.hadoop.hbase.master.ServerManager: Received start
message from: hbasetest004.ash1.facebook.com,60020,1264706494600
2010-01-28 11:21:36,649 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Updated
ZNode /hbase/rs/1264706494600 with data 10.129.68.203:60020
2010-01-28 11:21:40,704 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 39 on 60000,
call createTable({NAME => 'test1', FAMILIES => [{NAME => 'cf1', VERSIONS => '3',
COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY =>
'false', BLOCKCACHE => 'true'}]}) from 10.131.29.183:63308: error: org.apache.hadoop.hbase.TableExistsException:
test1
org.apache.hadoop.hbase.TableExistsException: test1
        at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:792)
        at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:756)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)

From an HRegionServer's log:

2010-01-28 11:20:22,589 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats:
Sizes: Total=19.661453MB (20616528), Free=2377.0137MB (2492479408), Max=2396.675MB (2513095936),
Counts: Blocks=0, Access=0, Hit=0, Miss=0, Evictions=0, Evicted=0, Ratios: Hit Ratio=NaN%,
Miss Ratio=NaN%, Evicted/Run=NaN
2010-01-28 11:21:22,588 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats:
Sizes: Total=19.661453MB (20616528), Free=2377.0137MB (2492479408), Max=2396.675MB (2513095936),
Counts: Blocks=0, Access=0, Hit=0, Miss=0, Evictions=0, Evicted=0, Ratios: Hit Ratio=NaN%,
Miss Ratio=NaN%, Evicted/Run=NaN
2010-01-28 11:22:18,794 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_CALL_SERVER_STARTUP


The region server code says the following about this message:
              case MSG_CALL_SERVER_STARTUP:
                // We the MSG_CALL_SERVER_STARTUP on startup but we can also
                // get it when the master is panicking because for instance
                // the HDFS has been yanked out from under it.  Be wary of
                // this message.

Any ideas on what is going on? My best guess is flaky DNS; would that explain this? This
happened on three of our test clusters at almost the same time. Also, what is the simplest,
most graceful way to recover from this?
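
One way to test the DNS guess would be a forward/reverse lookup round trip on each node. The master log above shows what looks like the same server reported once as 10.129.68.203 and once as hbasetest004.ash1.facebook.com, so a name that does not resolve back to itself would be a hint. Below is a minimal sketch using only the JDK; the class name DnsRoundTrip and its helper methods are made up for illustration and are not part of HBase.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

// Hypothetical helper, not HBase code: checks whether a host name
// survives a forward lookup followed by a reverse lookup.
class DnsRoundTrip {

    // Lower-case the name and drop a trailing dot so that
    // "Host.Example.com." and "host.example.com" compare equal.
    static String normalize(String host) {
        String h = host.trim().toLowerCase(Locale.ROOT);
        return h.endsWith(".") ? h.substring(0, h.length() - 1) : h;
    }

    // True when the reverse-resolved name matches the name we started from.
    static boolean sameHost(String expected, String resolved) {
        return normalize(expected).equals(normalize(resolved));
    }

    // Resolve the name to an address, then resolve that raw address back
    // to a name. If the reverse lookup fails, the JDK returns the textual
    // IP address instead of a host name.
    static String roundTrip(String host) throws UnknownHostException {
        InetAddress forward = InetAddress.getByName(host);
        InetAddress byIp = InetAddress.getByAddress(forward.getAddress());
        return byIp.getCanonicalHostName();
    }

    public static void main(String[] args) throws UnknownHostException {
        for (String host : args) {
            String back = roundTrip(host);
            String verdict = sameHost(host, back) ? "OK" : "MISMATCH";
            System.out.println(host + " -> " + back + " " + verdict);
        }
    }
}
```

Running it with each region server's host name as an argument, on the master and on the region servers themselves, prints one line per host. A MISMATCH would only suggest that DNS is worth a closer look; it would not by itself explain the NotServingRegionException.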


Thanks
Karthik

