hbase-user mailing list archives

From tsuna <tsuna...@gmail.com>
Subject HBase failing to restart in single-user mode
Date Mon, 18 May 2015 03:55:53 GMT
Hi all,
For testing on my laptop (OSX with JDK 1.7.0_45) I usually build the
latest version from branch-1.0 and use the following config:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///tmp/hbase-${user.name}</value>
  </property>
  <property>
    <name>hbase.online.schema.update.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>300000</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.tickTime</name>
    <value>2000000</value>
  </property>
  <property>
    <name>hbase.zookeeper.dns.interface</name>
    <value>lo0</value>
  </property>
  <property>
    <name>hbase.regionserver.dns.interface</name>
    <value>lo0</value>
  </property>
  <property>
    <name>hbase.master.dns.interface</name>
    <value>lo0</value>
  </property>
</configuration>

For at least a month (perhaps longer, I don’t remember exactly) I have
been unable to restart HBase.  The very first startup works fine, but
every subsequent startup attempt fails with:

2015-05-17 20:39:19,024 INFO  [RpcServer.responder] ipc.RpcServer:
RpcServer.responder: starting
2015-05-17 20:39:19,024 INFO  [RpcServer.listener,port=49809]
ipc.RpcServer: RpcServer.listener,port=49809: starting
2015-05-17 20:39:19,029 INFO  [main] http.HttpRequestLog: Http request
log for http.requests.regionserver is not defined
2015-05-17 20:39:19,030 INFO  [main] http.HttpServer: Added global
filter 'safety'
(class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter)
2015-05-17 20:39:19,031 INFO  [main] http.HttpServer: Added filter
static_user_filter
(class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter)
to context regionserver
2015-05-17 20:39:19,031 INFO  [main] http.HttpServer: Added filter
static_user_filter
(class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter)
to context static
2015-05-17 20:39:19,031 INFO  [main] http.HttpServer: Added filter
static_user_filter
(class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter)
to context logs
2015-05-17 20:39:19,033 INFO  [main] http.HttpServer: Jetty bound to port 49811
2015-05-17 20:39:19,033 INFO  [main] mortbay.log: jetty-6.1.26
2015-05-17 20:39:19,157 INFO  [main] mortbay.log: Started
SelectChannelConnector@0.0.0.0:49811
2015-05-17 20:39:19,222 INFO  [M:0;localhost:49807]
zookeeper.RecoverableZooKeeper: Process
identifier=hconnection-0x4f708099 connecting to ZooKeeper
ensemble=localhost:2181
2015-05-17 20:39:19,222 INFO  [M:0;localhost:49807]
zookeeper.ZooKeeper: Initiating client connection,
connectString=localhost:2181 sessionTimeout=10000
watcher=hconnection-0x4f7080990x0, quorum=localhost:2181,
baseZNode=/hbase
2015-05-17 20:39:19,223 INFO
[M:0;localhost:49807-SendThread(localhost:2181)] zookeeper.ClientCnxn:
Opening socket connection to server localhost/127.0.0.1:2181. Will not
attempt to authenticate using SASL (unknown error)
2015-05-17 20:39:19,223 INFO
[M:0;localhost:49807-SendThread(localhost:2181)] zookeeper.ClientCnxn:
Socket connection established to localhost/127.0.0.1:2181, initiating
session
2015-05-17 20:39:19,223 INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181]
server.NIOServerCnxnFactory: Accepted socket connection from
/127.0.0.1:49812
2015-05-17 20:39:19,223 INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.ZooKeeperServer:
Client attempting to establish new session at /127.0.0.1:49812
2015-05-17 20:39:19,224 INFO  [SyncThread:0] server.ZooKeeperServer:
Established session 0x14d651aaec00002 with negotiated timeout 4000000
for client /127.0.0.1:49812
2015-05-17 20:39:19,224 INFO
[M:0;localhost:49807-SendThread(localhost:2181)] zookeeper.ClientCnxn:
Session establishment complete on server localhost/127.0.0.1:2181,
sessionid = 0x14d651aaec00002, negotiated timeout = 4000000
2015-05-17 20:39:19,249 INFO  [M:0;localhost:49807]
regionserver.HRegionServer: ClusterId :
6ad7eddd-2886-4ff0-b377-a2ff42c8632f
2015-05-17 20:39:49,208 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: Master not active after 30 seconds
        at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:194)
        at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:445)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:197)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
        at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2002)
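
One detail in the log above can be checked with simple arithmetic: the
client asks for sessionTimeout=10000 but the server reports a negotiated
timeout of 4000000.  That is consistent with ZooKeeper clamping the
requested value to the range [2 * tickTime, 20 * tickTime], given the
tickTime of 2000000 in the config.  Below is a minimal sketch of that
clamping rule, assuming minSessionTimeout and maxSessionTimeout are left
unset on the server; the function name is made up for illustration, and
this is not offered as the cause of the hang.

```python
def negotiated_timeout_ms(requested_ms, tick_time_ms,
                          min_ms=None, max_ms=None):
    """Clamp a requested session timeout the way a ZooKeeper server does.

    Assumption: when minSessionTimeout / maxSessionTimeout are not set,
    they default to 2 * tickTime and 20 * tickTime respectively.
    """
    lo = 2 * tick_time_ms if min_ms is None else min_ms
    hi = 20 * tick_time_ms if max_ms is None else max_ms
    return max(lo, min(requested_ms, hi))


# hbase.zookeeper.property.tickTime from the config above
TICK_TIME_MS = 2000000

# sessionTimeout=10000 as requested by the client in the log
print(negotiated_timeout_ms(10000, TICK_TIME_MS))   # 4000000
# zookeeper.session.timeout=300000 from the config is also below the minimum
print(negotiated_timeout_ms(300000, TICK_TIME_MS))  # 4000000
```

With a tickTime this large, any requested timeout below 4000000 ms is
raised to that floor.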


I noticed that this has something to do with the ZooKeeper data.  If I
rm -rf $TMPDIR/hbase-tsuna/zookeeper then I can start HBase again.
But of course HBase won’t work properly because while some tables
exist on the filesystem, they no longer exist in ZK, etc.
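
For anyone trying to reproduce this, a less destructive variant of the
rm -rf above is to rename the ZooKeeper directory so its contents can
still be inspected afterwards.  This is only a sketch: the helper name
set_aside and the ".bak-<timestamp>" suffix are made up, the path is
built from $TMPDIR and the current user name to mirror the path above,
and HBase is assumed to be stopped first.

```python
import getpass
import os
import shutil
import time


def set_aside(zk_dir):
    """Rename zk_dir to a timestamped sibling directory.

    Returns the new path, or None if zk_dir does not exist.
    """
    if not os.path.isdir(zk_dir):
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = "%s.bak-%s" % (zk_dir.rstrip(os.sep), stamp)
    if os.path.exists(backup):
        raise RuntimeError("backup path already exists: " + backup)
    shutil.move(zk_dir, backup)
    return backup


def default_zk_dir():
    """Build $TMPDIR/hbase-<user>/zookeeper (assumed layout, as in the mail)."""
    tmp = os.environ.get("TMPDIR", "/tmp")
    return os.path.join(tmp, "hbase-" + getpass.getuser(), "zookeeper")


# Usage (run only while HBase is stopped):
#   moved_to = set_aside(default_zk_dir())
#   print(moved_to or "nothing to move")
```

Like the rm -rf, this still leaves tables on the filesystem that ZK no
longer knows about; it only preserves the old ZK state for comparison.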

Does anybody know what could be left behind in ZK that could make it
hang during startup?  I looked at jstack output taken during the
30-second pause and didn’t find anything noteworthy.

-- 
Benoit "tsuna" Sigoure
