hbase-issues mailing list archives

From "Jean-Daniel Cryans (JIRA)" <j...@apache.org>
Subject [jira] Created: (HBASE-2998) rolling-restart.sh shouldn't rely on zoo.cfg
Date Tue, 14 Sep 2010 21:54:45 GMT
rolling-restart.sh shouldn't rely on zoo.cfg
--------------------------------------------

                 Key: HBASE-2998
                 URL: https://issues.apache.org/jira/browse/HBASE-2998
             Project: HBase
          Issue Type: Bug
            Reporter: Jean-Daniel Cryans
             Fix For: 0.90.0


I tried the rolling-restart script on our dev environment, which is configured with a zoo.cfg
for ZooKeeper, and it worked pretty well. Then I tried it on our MR cluster, which doesn't
have a zoo.cfg, and we suffered some downtime (no biggie though, nothing critical was running).
When the script calls this line:

{code}
bin/hbase zkcli stat $zmaster
{code}

It directly runs a ZooKeeperMain, which isn't modified to read from the HBase configuration
files. If ZK isn't running on the master node, the call gets a ConnectionRefused; the script
ignores it, proceeds to restart the master (which waits on the znode), and then starts restarting
the region servers. They can't shut down properly within 60 seconds, since they need a master,
so they get killed. What follows is pretty ugly and pretty much requires a whole restart.
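
As an illustration of the failure mode above (not the fix that went into HBase), here is a minimal sketch of a guard the script could run before touching the master. The function name zk_check is hypothetical, and the "connection refused" pattern is an assumption about what the client prints; whether zkcli exits non-zero on a refused connection is also not confirmed here, which is why the sketch checks both the exit status and the output.

```shell
#!/usr/bin/env bash
# Hypothetical guard; zk_check is not part of rolling-restart.sh.
# Runs the given command and fails if it exits non-zero or if its
# output mentions a refused connection (assumed wording; adjust the
# pattern to whatever the ZooKeeper client actually prints).
zk_check() {
  local out
  # Capture stdout and stderr; a non-zero exit status is a failure.
  if ! out=$("$@" 2>&1); then
    echo "ZooKeeper check failed: $*" >&2
    return 1
  fi
  # Some clients log the error but still exit 0, so scan the output too.
  if printf '%s\n' "$out" | grep -qiE 'connection ?refused'; then
    echo "ZooKeeper unreachable: $*" >&2
    return 1
  fi
  return 0
}

# Usage sketch, with $zmaster as in the snippet from the issue:
#   zk_check bin/hbase zkcli stat "$zmaster" || exit 1
```

With a check like this, the script would stop before restarting the master instead of going on to the region servers. It does not address the underlying problem that the ZooKeeper client ignores the HBase configuration files.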

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

