Date: Thu, 19 Dec 2013 02:24:09 +0000 (UTC)
From: "Liu Shaohui (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-10049) Small improvments in region_mover.rb

    [ https://issues.apache.org/jira/browse/HBASE-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852498#comment-13852498 ]

Liu Shaohui commented on HBASE-10049:
-------------------------------------

[~jmspaggi]
{quote}
What is the reason behind the 20 seconds delay?
{quote}
There is a time gap between an RS reporting its startup to the HMaster and the RS starting its service threads. We saw exceptions when moving regions because the RS had not yet finished starting its service threads, so we added a 20-second delay to make sure the RS has enough time to finish initialization. The value of 20 may not be reasonable, though, especially for large clusters.
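
The concern about a fixed 20-second value can be illustrated with a bounded wait: poll a readiness check and give up after a maximum delay. The following is only a minimal sketch under stated assumptions, not code from region_mover.rb or the attached patches. The helper name `wait_until_ready`, its parameters, and the probe block are all hypothetical; how one would actually detect that an RS has started its service threads is left open here.

```ruby
# Hypothetical helper (not part of region_mover.rb): poll a readiness probe
# until it succeeds or max_wait seconds of sleeping have been spent.
# The probe is supplied as a block; what it checks is an assumption.
def wait_until_ready(max_wait: 20, interval: 1, sleeper: ->(s) { sleep(s) })
  waited = 0
  loop do
    return true if yield                # probe reports the server is ready
    return false if waited >= max_wait  # give up once the budget is used
    sleeper.call(interval)
    waited += interval
  end
end

# Example with a stand-in probe that succeeds on the third attempt.
# A no-op sleeper is passed so the example finishes immediately.
attempts = 0
ready = wait_until_ready(max_wait: 20, interval: 1, sleeper: ->(_s) {}) do
  (attempts += 1) >= 3
end
puts "ready=#{ready} after #{attempts} probes"
```

With the defaults, this waits at most as long as the fixed 20-second sleep but returns earlier when the probe succeeds sooner.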

PS: For large clusters, we plan to develop a region_mover that can unload/load multiple regionservers at the same time.

{quote}
Adding a log in the move method makes the application VERY verbose, doubling the output. Is that really useful?
{quote}
Yes. I think it is very useful for measuring the maximum unavailable time of each region when using region_mover.rb. Many other factors and configs affect this time, e.g. hbase.hstore.open.and.close.threads.max. Based on this time, we can make further optimizations to reduce the unavailable time during a graceful upgrade.

I don't know whether this explanation is clear. More discussion is welcome. Thanks.

> Small improvments in region_mover.rb
> ------------------------------------
>
>                 Key: HBASE-10049
>                 URL: https://issues.apache.org/jira/browse/HBASE-10049
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>            Priority: Minor
>             Fix For: 0.98.0, 0.94.15, 0.96.2
>
>         Attachments: HBASE-10049-0.94-v1.diff, HBASE-10049-0.94-v2.diff, HBASE-10049-trunk-v1.diff
>
>
> We use region_mover.rb in the graceful upgrade of our HBase cluster.
> Here are some small improvements:
> a. Remove the table.close() call, because the htable can be reused.
> b. Add more info to the log message for moving a region.
> c. Add a 20s sleep in the load command to make sure the RS has finished initializing its RPC server. There is a time gap between the RS startup report and RPC server initialization.

--
This message was sent by Atlassian JIRA
(v6.1.4#6159)