hbase-user mailing list archives

From Jochen Frey <joc...@scoutlabs.com>
Subject Table disabled but all regions still online?
Date Wed, 18 Nov 2009 16:10:09 GMT
Hi!

We are evaluating HBase as a store for production data. Performance of our
cluster appears quite good, but we have had operational issues in the last
couple of days that are very concerning, and I could use help with them:

I have a strange situation that has happened at least twice in the last
48 hours. We are running HBase 0.20.1 on a 9-node cluster (one master /
name node, 8 data / region server nodes).

The situation is that the table is disabled according to "describe 'table-name'":

DESCRIPTION                                                             ENABLED
 {NAME => 'dds-test', FAMILIES => [{NAME => 'data', VERSIONS => '1',    false
 COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536',
 IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'meta',
 VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
 BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
1 row(s) in 0.4630 seconds

However, at the same time all of its regions are still online, which I can
verify through the web interface as well as the command-line interface (>
400 regions).

This has happened at least twice by now. The first time I was able to "fix"
it by restarting HDFS, the second time restarting didn't fix it.

The first time this happened, we had a lot going on (a rolling restart of the
HBase nodes, with the HDFS balancer running). The second time I found the
following exception in the master log (below). Can anyone shed some light on
this or tell me what additional information would be helpful for debugging?

Thanks so much!
Jochen


2009-11-17 20:59:12,751 INFO org.apache.hadoop.hbase.master.ServerManager: 8 region servers, 0 dead, average load 50.25
2009-11-17 20:59:13,611 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scanning meta region {server: 10.10.0.177:60020, regionname: -ROOT-,,0, startKey: <>}
2009-11-17 20:59:13,611 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scanning meta region {server: 10.10.0.189:60020, regionname: .META.,,1, startKey: <>}
2009-11-17 20:59:13,620 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scan of 1 row(s) of meta region {server: 10.10.0.177:60020, regionname: -ROOT-,,0, startKey: <>} complete
2009-11-17 20:59:13,622 WARN org.apache.hadoop.hbase.master.BaseScanner: Scan one META region: {server: 10.10.0.189:60020, regionname: .META.,,1, startKey: <>}
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:308)
        at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:831)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:712)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:328)
        at $Proxy6.openScanner(Unknown Source)
        at org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:160)
        at org.apache.hadoop.hbase.master.MetaScanner.scanOneMetaRegion(MetaScanner.java:73)
        at org.apache.hadoop.hbase.master.MetaScanner.maintenanceScan(MetaScanner.java:129)
        at org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:136)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:68)
2009-11-17 20:59:13,623 INFO org.apache.hadoop.hbase.master.BaseScanner: All 1 .META. region(s) scanned
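
In case it helps with debugging, here is a minimal Python sketch for pulling the affected server and region out of a master log like the one above. It assumes the log follows the format shown here (timestamped entries, a WARN entry followed by a java.net.ConnectException). The function name failed_scans and both regular expressions are hypothetical helpers written for this illustration, not part of HBase.

```python
import re

# Start of a log entry, e.g. "2009-11-17 20:59:13,622 " (format assumed from the log above).
TS_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ", re.MULTILINE)

# Region descriptor as printed by BaseScanner, e.g.
# {server: 10.10.0.189:60020, regionname: .META.,,1, startKey: <>}
REGION_RE = re.compile(r"\{server: (?P<server>\S+), regionname: (?P<region>\S+), startKey:")


def failed_scans(log_text):
    """Return (server, regionname) pairs for WARN entries that carry a ConnectException."""
    # Split the text into entries; each entry runs from one timestamp to the next,
    # so a stack trace stays attached to the entry that produced it.
    starts = [m.start() for m in TS_RE.finditer(log_text)]
    entries = [log_text[a:b] for a, b in zip(starts, starts[1:] + [len(log_text)])]

    results = []
    for entry in entries:
        # Collapse all whitespace so descriptors wrapped across lines still match.
        flat = " ".join(entry.split())
        if " WARN " in flat and "ConnectException" in flat:
            match = REGION_RE.search(flat)
            if match:
                results.append((match.group("server"), match.group("region")))
    return results
```

Run against the excerpt above, this should point at the region server that was hosting .META. when the scan failed, which is the first machine to check for a stopped or restarting region server process.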


-- 
Jochen Frey . CTO
Scout Labs
415.366.0450
www.scoutlabs.com
