hbase-issues mailing list archives

From "JoneZhang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-14074) HBase cluster crashed on-the-hour
Date Tue, 14 Jul 2015 09:00:10 GMT
JoneZhang created HBASE-14074:
---------------------------------

             Summary: HBase cluster crashed on-the-hour 
                 Key: HBASE-14074
                 URL: https://issues.apache.org/jira/browse/HBASE-14074
             Project: HBase
          Issue Type: Bug
          Components: Admin
    Affects Versions: 0.96.2
         Environment: Hadoop 2.5.1
HBase 0.96.2
            Reporter: JoneZhang


I found that the HBase cluster crashes on the hour.
The HBase master log is as follows:

"2015-07-14 14:41:49,832 DEBUG [master:10.240.131.18:60000.oldLogCleaner] master.ReplicationLogCleaner:
Didn't find this log in ZK, deleting: 10-241-125-46%2C60020%2C1436841063572.1436851865226
2015-07-14 14:45:49,822 DEBUG [master:10.240.131.18:60000.oldLogCleaner] master.ReplicationLogCleaner:
Didn't find this log in ZK, deleting: 10-241-85-137%2C60020%2C1436841341086.1436852143141
2015-07-14 15:00:03,481 INFO  [main] util.VersionInfo: HBase 0.96.2-hadoop2
2015-07-14 15:00:03,481 INFO  [main] util.VersionInfo: Subversion https://svn.apache.org/repos/asf/hbase/tags/0.96.2RC2
-r 1581096
2015-07-14 15:00:03,481 INFO  [main] util.VersionInfo: Compiled by stack on Mon Mar 24 16:03:18
PDT 2014
2015-07-14 15:00:03,729 INFO  [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090,
built on 09/30/2012 17:52 GMT
2015-07-14 15:00:03,730 INFO  [main] zookeeper.ZooKeeper: Client environment:host.name=10-240-131-18
2015-07-14 15:00:03,730 INFO  [main] zookeeper.ZooKeeper: Client environment:java.version=1.7.0_72

...

2015-07-14 15:00:03,749 INFO  [main] zookeeper.RecoverableZooKeeper: Process identifier=clean
znode for master connecting to ZooKeeper ensemble=10.240.131.17:2200,10.240.131.16:2200,10.240.131.15:2200,10.240.131.14:2200,10.240.131.18:2200
2015-07-14 15:00:03,751 INFO  [main-SendThread(10-240-131-18:2200)] zookeeper.ClientCnxn:
Opening socket connection to server 10-240-131-18/10.240.131.18:2200. Will not attempt to
authenticate using SASL (unknown error)
2015-07-14 15:00:03,757 INFO  [main-SendThread(10-240-131-18:2200)] zookeeper.ClientCnxn:
Socket connection established to 10-240-131-18/10.240.131.18:2200, initiating session
2015-07-14 15:00:03,764 INFO  [main-SendThread(10-240-131-18:2200)] zookeeper.ClientCnxn:
Session establishment complete on server 10-240-131-18/10.240.131.18:2200, sessionid = 0x34e8a64b453024a,
negotiated timeout = 40000
2015-07-14 15:00:04,835 INFO  [main] zookeeper.ZooKeeper: Session: 0x34e8a64b453024a closed
2015-07-14 15:00:04,835 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down"


Every hour, after printing "Didn't find this log in ZK...",
the master dies.
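
The on-the-hour pattern described above could be confirmed across a longer stretch of the master log with a small script. The following is only a sketch: it assumes that a "util.VersionInfo: HBase" banner line marks a fresh JVM start in the log, which does not by itself distinguish a master restart from another HBase command writing to the same file. The function names and the 60-second tolerance are hypothetical choices for illustration.

```python
import re
from datetime import datetime

# Matches the leading timestamp of a log line such as
# "2015-07-14 15:00:03,481 INFO ...", allowing an optional opening quote.
TIMESTAMP = re.compile(r'^"?(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}')


def startup_times(lines, marker="util.VersionInfo: HBase "):
    """Return the timestamps of all lines that contain the startup marker.

    Assumption: the marker is printed once per process start.
    """
    times = []
    for line in lines:
        if marker not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return times


def near_top_of_hour(moment, tolerance_seconds=60):
    """True when the moment falls within tolerance_seconds after the hour."""
    return moment.minute * 60 + moment.second <= tolerance_seconds


def summarize(lines):
    """Pair each detected startup time with whether it is near the hour."""
    return [(t, near_top_of_hour(t)) for t in startup_times(lines)]
```

Running summarize() over the lines of the full master log would list each detected start time together with a flag; if every flag is True, the starts line up with the hour, which would point toward something scheduled (for example a cron job) rather than a random failure.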


The ZooKeeper log is as follows:

"2015-07-14 15:00:03,756 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2200:NIOServerCnxnFactory@197]
- Accepted socket connection from /10.240.131.18:52733
2015-07-14 15:00:03,761 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2200:ZooKeeperServer@868]
- Client attempting to establish new session at /10.240.131.18:52733
2015-07-14 15:00:03,762 [myid:3] - INFO  [CommitProcessor:3:ZooKeeperServer@617] - Established
session 0x34e8a64b453024a with negotiated timeout 40000 for client /10.240.131.18:52733
2015-07-14 15:00:04,836 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2200:NIOServerCnxn@1007]
- Closed socket connection for client /10.240.131.18:52733 which had sessionid 0x34e8a64b453024a"




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
