Date: Sat, 8 Jun 2013 21:54:20 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-8716) Fixups/Improvements for graceful_stop.sh/region_mover.rb

    [ https://issues.apache.org/jira/browse/HBASE-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678866#comment-13678866 ]

stack commented on HBASE-8716:
------------------------------

I tried these changes on a cluster and they seem to do the right thing.

Here is before the change:

{code}
[stack@sss-1 ~]$ ./hbase/bin/graceful_stop.sh --config /home/stack/conf-hbase x
2013-06-08T14:22:02 Disabling load balancer
2013-06-08T14:22:09 Previous balancer state was false
2013-06-08T14:22:09 Unloading x region(s)
2013-06-08 14:22:14,867 TRACE [main] zookeeper.ZKConfig: Skipped reading ZK properties file 'zoo.cfg' since 'hbase.config.read.zookeeper.config' was not set to true
2013-06-08 14:22:14,907 INFO [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2013-06-08 14:22:14,907 INFO [main] zookeeper.ZooKeeper: Client environment:host.name=sss-1.ent.cloudera.com
2013-06-08 14:22:14,907 INFO [main] zookeeper.ZooKeeper: Client environment:java.version=1.6.0_31
2013-06-08 14:22:14,907 INFO [main] zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
2013-06-08 14:22:14,907 INFO [main] zookeeper.ZooKeeper: Client environment:java.home=/usr/java/jdk1.6.0_31/jre
....
2013-06-08 14:22:14,990 INFO [main-SendThread(sss-1.ent.cloudera.com:2181)] zookeeper.ClientCnxn: Socket connection established to sss-1.ent.cloudera.com/10.20.195.21:2181, initiating session
2013-06-08 14:22:15,049 INFO [main-SendThread(sss-1.ent.cloudera.com:2181)] zookeeper.ClientCnxn: Session establishment complete on server sss-1.ent.cloudera.com/10.20.195.21:2181, sessionid = 0x13ef746f91a0054, negotiated timeout = 90000
RuntimeError: Server x not online
  stripServer at /home/stack/hbase/bin/region_mover.rb:200
  unloadRegions at /home/stack/hbase/bin/region_mover.rb:306
  (root) at /home/stack/hbase/bin/region_mover.rb:456
2013-06-08T14:22:16 Unloaded x region(s)
2013-06-08T14:22:16 Stopping regionserver
x: ssh: Could not resolve hostname x: Name or service not known
[stack@sss-1 ~]$ echo $?
0
{code}

Here is after the change passing -e:

{code}
[stack@sss-1 ~]$ ./hbase/bin/graceful_stop.sh --config /home/stack/conf-hbase -e x
2013-06-08T14:24:10 Disabling load balancer
2013-06-08T14:24:17 Previous balancer state was false
2013-06-08T14:24:17 Unloading x region(s)
2013-06-08 14:24:22,883 TRACE [main] zookeeper.ZKConfig: Skipped reading ZK properties file 'zoo.cfg' since 'hbase.config.read.zookeeper.config' was not set to true
2013-06-08 14:24:22,920 INFO [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2013-06-08 14:24:22,920 INFO [main] zookeeper.ZooKeeper: Client environment:host.name=sss-1.ent.cloudera.com
2013-06-08 14:24:22,920 INFO [main] zookeeper.ZooKeeper: Client environment:java.version=1.6.0_31
...
2013-06-08 14:24:22,949 INFO [main] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x24eff2c connecting to ZooKeeper ensemble=sss-1.ent.cloudera.com:2181
2013-06-08 14:24:22,964 INFO [main-SendThread(sss-1.ent.cloudera.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server sss-1.ent.cloudera.com/10.20.195.21:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2013-06-08 14:24:22,974 INFO [main-SendThread(sss-1.ent.cloudera.com:2181)] zookeeper.ClientCnxn: Socket connection established to sss-1.ent.cloudera.com/10.20.195.21:2181, initiating session
2013-06-08 14:24:23,020 INFO [main-SendThread(sss-1.ent.cloudera.com:2181)] zookeeper.ClientCnxn: Session establishment complete on server sss-1.ent.cloudera.com/10.20.195.21:2181, sessionid = 0x13ef746f91a0057, negotiated timeout = 90000
RuntimeError: Server x not online
  stripServer at /home/stack/hbase/bin/region_mover.rb:200
  unloadRegions at /home/stack/hbase/bin/region_mover.rb:306
  (root) at /home/stack/hbase/bin/region_mover.rb:456
[stack@sss-1 ~]$ echo $?
1
{code}

> Fixups/Improvements for graceful_stop.sh/region_mover.rb
> --------------------------------------------------------
>
>                 Key: HBASE-8716
>                 URL: https://issues.apache.org/jira/browse/HBASE-8716
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: stack
>         Attachments: 8716.txt
>
>
> It has been a while since these scripts were touched. Giving them a spring cleaning and seeing if I can make them return error codes on failure (the previous style seems to have been that the operator would watch the output and react to it, but I see cases where tools want to call these scripts and want the return code to indicate whether the rolling upgrade worked or not). Also, see if I can make the rolling restart faster, since one-by-one, while minimally disruptive and 'safe', is slow on clusters of hundreds of nodes.
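
To make the return-code point concrete, here is a minimal sketch of the kind of calling tool the description has in mind. It assumes the caller only looks at the exit status of a per-host command and stops at the first non-zero one. The function name run_for_hosts is hypothetical and not part of HBase; the per-host command is passed in, so nothing here depends on how graceful_stop.sh itself is implemented.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of HBase): run a command once per host and
# stop at the first host for which the command exits non-zero.
#   $1     command to run; it receives the host name as its only argument
#   $2...  host names
run_for_hosts() {
  local cmd="$1"; shift
  local host
  for host in "$@"; do
    if ! "$cmd" "$host"; then
      # Rely on the exit status only, not on watching the output.
      echo "FAILED on $host" >&2
      return 1
    fi
    echo "ok $host"
  done
  return 0
}
```

A caller could pass in a small function that wraps the invocation from the second run above (graceful_stop.sh with --config and -e). With the "before" behaviour, where the script exits 0 even though the server is not online, a loop like this would carry on as if nothing had gone wrong.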