hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: hbase-master-server slept
Date Fri, 08 Feb 2013 18:48:04 GMT
What ZooKeeper version are you using?
Is the ensemble managed by HBase?

Can you check the ZooKeeper log on 192.168.152.1 <http://192.168.152.1:2181/>?
Use pastebin to show us the log if necessary.

Thanks
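
When lining up the master log with the ZooKeeper log, it may help to pull the timing figures out of the quoted lines first. Below is a minimal Python sketch, not an HBase or ZooKeeper tool. The function name `extract_timings` and the regular expressions are assumptions based only on the message formats quoted in this thread.

```python
import re

# Message formats assumed from the log lines quoted in this thread.
SLEPT = re.compile(r"We slept\s+(\d+)ms instead of\s+(\d+)ms")
SILENT = re.compile(r"have not heard from server in\s+(\d+)ms")
NEGOTIATED = re.compile(r"negotiated timeout\s*=\s*(\d+)")


def extract_timings(log_text):
    """Collect pause and timeout figures (in ms) from a mail-quoted log."""
    # Mail clients wrap long lines and prefix them with '>', so strip the
    # quote markers and rejoin everything before matching.
    flat = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())
    return {
        "slept": [(int(actual), int(expected))
                  for actual, expected in SLEPT.findall(flat)],
        "silent": [int(ms) for ms in SILENT.findall(flat)],
        "negotiated": [int(ms) for ms in NEGOTIATED.findall(flat)],
    }


# A few lines copied from the quoted log, including the mail line wrapping.
sample = """\
> 2013-01-30 03:07:48,582 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
> 28970ms instead of 1000ms, this is likely due to a long garbage collecting
> timed out, have not heard from server in 39989ms for sessionid
> 0x13c84cebfce0005, negotiated timeout = 40000
"""
timings = extract_timings(sample)
print(timings)
```

In the quoted log the client heard nothing from the server for 39989ms while the negotiated timeout was 40000, so the silence was about as long as the session timeout. The client also requested sessionTimeout=180000 but the server negotiated 40000. ZooKeeper servers cap the session timeout (maxSessionTimeout, by default 20 times tickTime), which may explain the difference, so the ZooKeeper configuration is worth checking as well.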

On Fri, Feb 8, 2013 at 12:55 AM, So Hibino <hibino.so@lab.ntt.co.jp> wrote:

> Our hbase-master-server shut down with the following message.
> HBase is running in distributed mode on a single node.
> I checked that GC completed in a very short time at the time the WARN was
> output.
> In addition, another system running on the same architecture
> doesn't output the following WARN message and works well.
> So I think that this is not due to a long GC pause.
>
> Do you have any idea about the problem?
>
> 2013-01-30 03:07:48,582 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
> 28970ms instead of 1000ms, this is likely due to a long garbage collecting
> pause and it's usually bad, see
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> 2013-01-30 03:07:48,583 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
> 36902ms instead of 10000ms, this is likely due to a long garbage collecting
> pause and it's usually bad, see
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> 2013-01-30 03:07:48,585 INFO org.apache.zookeeper.ClientCnxn: Client
> session
> timed out, have not heard from server in 39989ms for sessionid
> 0x13c84cebfce0000, closing socket connection and attempting reconnect
> 2013-01-30 03:07:48,586 INFO org.apache.zookeeper.ClientCnxn: Client
> session
> timed out, have not heard from server in 39987ms for sessionid
> 0x13c84cebfce0001, closing socket connection and attempting reconnect
> 2013-01-30 03:07:52,779 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket
> connection to server VM_11/192.168.152.1:2181
> 2013-01-30 03:07:52,789 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to VM_11/192.168.152.1:2181, initiating session
> 2013-01-30 03:07:52,777 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket
> connection to server VM_11/192.168.152.1:2181
> 2013-01-30 03:07:52,793 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to VM_11/192.168.152.1:2181, initiating session
> 2013-01-30 03:07:52,794 INFO org.apache.zookeeper.ClientCnxn: Unable to
> reconnect to ZooKeeper service, session 0x13c84cebfce0001 has expired,
> closing socket connection
> 2013-01-30 03:07:52,794 INFO
>
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> This client just lost it's session with ZooKeeper, trying to reconnect.
> 2013-01-30 03:07:52,794 INFO
>
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> Trying to reconnect to zookeeper.
> 2013-01-30 03:07:52,795 INFO org.apache.zookeeper.ZooKeeper: Initiating
> client connection, connectString=VM_11:2181 sessionTimeout=180000
> watcher=hconnection
> 2013-01-30 03:07:52,812 INFO org.apache.zookeeper.ClientCnxn: Unable to
> reconnect to ZooKeeper service, session 0x13c84cebfce0000 has expired,
> closing socket connection
> 2013-01-30 03:07:52,813 FATAL org.apache.hadoop.hbase.master.HMaster:
>
> master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000
>
> master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000
> received expired from ZooKeeper, aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired
>         at
>
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:361)
>         at
>
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:279)
>         at
>
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:526)
>         at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> 2013-01-30 03:07:52,813 INFO org.apache.hadoop.hbase.master.HMaster:
> Aborting
> 2013-01-30 03:07:52,813 INFO org.apache.zookeeper.ClientCnxn: EventThread
> shut down
> 2013-01-30 03:07:52,813 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket
> connection to server VM_11/192.168.152.1:2181
> 2013-01-30 03:07:52,814 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to VM_11/192.168.152.1:2181, initiating session
> 2013-01-30 03:07:52,815 ERROR org.apache.hadoop.hbase.master.HMaster:
> Region
> server serverName=VM_11,60020,1359437833300, load=(requests=0, regions=3,
> usedHeap=45, maxHeap=997) reported a fatal error:
> ABORTING region server serverName=VM_11,60020,1359437833300,
> load=(requests=0, regions=3, usedHeap=45, maxHeap=997):
>
> regionserver:60020-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002
>
> regionserver:60020-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002
> received expired from ZooKeeper, aborting
> Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired
>         at
>
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:361)
>         at
>
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:279)
>         at
>
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:526)
>         at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
>
> 2013-01-30 03:07:52,820 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server VM_11/192.168.152.1:2181, sessionid =
> 0x13c84cebfce0005, negotiated timeout = 40000
> 2013-01-30 03:07:52,841 INFO
>
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> Reconnected successfully. This disconnect could have been caused by a
> network partition or a long-running GC pause, either way it's recommended
> that you verify your environment.
> 2013-01-30 03:07:52,841 INFO org.apache.zookeeper.ClientCnxn: EventThread
> shut down
> 2013-01-30 03:07:53,614 INFO org.apache.hadoop.hbase.master.LogCleaner:
> master-VM_11:60000.oldLogCleaner exiting
> 2013-01-30 03:07:54,251 INFO org.apache.hadoop.hbase.master.HMaster$2:
> VM_11:60000-BalancerChore exiting
> 2013-01-30 03:07:54,251 DEBUG org.apache.hadoop.hbase.master.HMaster:
> Stopping service threads
> 2013-01-30 03:07:54,251 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> server on 60000
> 2013-01-30 03:07:54,252 INFO org.apache.hadoop.hbase.master.HMaster:
> Stopping infoServer
> 2013-01-30 03:07:54,325 INFO org.mortbay.log: Stopped
> SelectChannelConnector@0.0.0.0:60010
> 2013-01-30 03:07:54,326 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 5 on 60000: exiting
> 2013-01-30 03:07:54,326 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> IPC
> Server listener on 60000
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 9 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 8 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 7 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 6 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 4 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 3 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 2 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.hbase.master.CatalogJanitor:
> VM_11:60000-CatalogJanitor exiting
> 2013-01-30 03:07:54,328 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 0 on 60000: exiting
> 2013-01-30 03:07:54,328 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> IPC
> Server Responder
> 2013-01-30 03:07:54,337 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil:
>
> master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000
> Unable to get data of znode /hbase/master
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for /hbase/master
>         at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
>         at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
>         at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:577)
>         at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:554)
>         at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:648)
>         at
>
> org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:202)
>         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:318)
> 2013-01-30 03:07:54,337 ERROR
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher:
>
> master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000
> Received unexpected KeeperException, re-throwing exception
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for /hbase/master
>         at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
>         at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
>         at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:577)
>         at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:554)
>         at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:648)
>         at
>
> org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:202)
>         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:318)
> 2013-01-30 03:07:54,337 ERROR
> org.apache.hadoop.hbase.master.ActiveMasterManager:
>
> master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000
> Error deleting our own master address node
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for /hbase/master
>         at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
>         at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
>         at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:577)
>         at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:554)
>         at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:648)
>         at
>
> org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:202)
>         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:318)
> 2013-01-30 03:07:54,337 DEBUG
> org.apache.hadoop.hbase.catalog.CatalogTracker: Stopping catalog tracker
> org.apache.hadoop.hbase.catalog.CatalogTracker@4743bf3d
> 2013-01-30 03:07:54,337 DEBUG
>
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> The connection to hconnection-0x13c84cebfce0005 has been closed.
> 2013-01-30 03:07:54,338 INFO
>
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> Closed zookeeper sessionid=0x13c84cebfce0005
> 2013-01-30 03:07:54,339 INFO org.apache.zookeeper.ZooKeeper: Session:
> 0x13c84cebfce0005 closed
> 2013-01-30 03:07:54,339 DEBUG
>
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> The connection to null has been closed.
> 2013-01-30 03:07:54,339 INFO org.apache.hadoop.hbase.master.HMaster:
> HMaster
> main thread exiting
> 2013-01-30 03:07:54,339 INFO org.apache.zookeeper.ClientCnxn: EventThread
> shut down
> 2013-01-30 03:07:54,339 INFO
> org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor:
> VM_11:60000.timeoutMonitor exiting
>
>
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/hbase-master-server-slept-tp4038192.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
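
The Sleeper warnings at the top of the quoted log are raised when a periodic thread wakes up much later than it asked to. The following is a rough Python sketch of that kind of check, not the actual org.apache.hadoop.hbase.util.Sleeper code. The class name `OversleepDetector`, the tolerance value, and the injectable `clock` and `sleep` parameters are all hypothetical and exist only for illustration.

```python
import time


class OversleepDetector:
    """Sleep for a fixed period and report when the wake-up came far too late."""

    def __init__(self, period_s, tolerance_s, clock=time.monotonic, sleep=time.sleep):
        self.period_s = period_s        # how long we intend to sleep
        self.tolerance_s = tolerance_s  # extra delay accepted before warning
        self.clock = clock              # injectable so the check can be tested
        self.sleep = sleep

    def sleep_once(self):
        start = self.clock()
        self.sleep(self.period_s)
        slept = self.clock() - start
        # Any stall of the whole process (GC, swapping, a paused VM) shows up
        # here as extra elapsed time; the check cannot tell which one it was.
        overslept = (slept - self.period_s) > self.tolerance_s
        return slept, overslept


def fake_clock(values):
    """Return a clock function that yields the given timestamps in order."""
    it = iter(values)
    return lambda: next(it)


# Simulate the first warning in the log: asked for 1s, woke up 28.97s later.
stalled = OversleepDetector(1.0, 10.0, clock=fake_clock([0.0, 28.97]),
                            sleep=lambda s: None)
stalled_result = stalled.sleep_once()

# A normal wake-up for comparison.
normal = OversleepDetector(1.0, 10.0, clock=fake_clock([0.0, 1.0]),
                           sleep=lambda s: None)
normal_result = normal.sleep_once()
```

A check like this only measures wall-clock delay, so a short GC log at the time of the warning does not rule out a stall from another source, such as the host or hypervisor pausing the process.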
