Date: Fri, 20 Nov 2015 00:14:10 +0000 (UTC)
From: "Mikhail Antonov (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-14838) SimpleRegionNormalizer does not merge empty region of a table

    [ https://issues.apache.org/jira/browse/HBASE-14838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014829#comment-15014829 ]

Mikhail Antonov commented on HBASE-14838:
-----------------------------------------

bq. In this contrived case, we had only written a small amount of data to one of the regions (<1MB). I've yet to investigate why the greater-than-zero amount of data in one region was ultimately treated as no data (confirmed via a remote debugger attached to the master).

That's because the code that calculates region size in the region normalizer uses metrics (ServerLoad/RegionLoad based), where region size (the aggregated store file size) is represented in MB and is floored (truncated). If you have, say, 80 KB worth of data, the normalizer thinks it's zero. That's the reason the minicluster tests for this feature generate more than 1 MB of data per region. I remember looking for some convenient method that would report the exact size (like, hm, Region#size()), but didn't find anything suitable.

> SimpleRegionNormalizer does not merge empty region of a table
> -------------------------------------------------------------
>
>                 Key: HBASE-14838
>                 URL: https://issues.apache.org/jira/browse/HBASE-14838
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Romil Choksi
>
> SimpleRegionNormalizer does not merge empty region of a table
> Steps to repro:
> - Create an empty table with a few, say 5-6, regions without any data in any of them
> - Check the hbase:meta table or the HMaster UI to verify the regions for the table
> - Enable the normalizer switch and normalization for this table
> - Run the normalizer with the 'normalize' command from the hbase shell
> - Verify the regions for the table by scanning the hbase:meta table or checking the HMaster web UI
> The empty regions are not merged on running the region normalizer. This seems to be an edge case with completely empty regions, since the Normalizer checks for: smallestRegion (in this case 0 size) + smallestNeighborOfSmallestRegion (in this case 0 size) > avg region size (in this case 0 size)
> thanks to [~elserj] for verifying this from the source code side

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
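
For illustration, here is a minimal Java sketch of the two effects discussed in this thread: a byte count floored to whole MB, and a strict size comparison that can never hold when every region reports 0. This is not the actual SimpleRegionNormalizer code. The class name, method names and the direction of the comparison are assumptions made for the example; the issue text writes the check with ">", and either strict direction evaluates to false when all sizes are 0.

```java
// Hypothetical sketch, not HBase source: shows why sub-1 MB regions look empty
// and why a strict comparison never fires when all sizes are zero.
class RegionSizeSketch {
    static final long BYTES_PER_MB = 1024L * 1024L;

    // Integer division truncates, mirroring a metric that reports size in whole MB.
    static long flooredSizeMb(long sizeBytes) {
        return sizeBytes / BYTES_PER_MB;
    }

    // Assumed merge check: combined size of the two smallest neighbors
    // strictly below the average region size.
    static boolean combinedStrictlyBelowAverage(long smallestMb, long neighborMb, double avgMb) {
        return smallestMb + neighborMb < avgMb;
    }

    public static void main(String[] args) {
        long smallRegionMb = flooredSizeMb(80L * 1024L);        // 80 KB of data
        long largerRegionMb = flooredSizeMb(3L * BYTES_PER_MB); // 3 MB of data
        System.out.println("80 KB region reported as MB: " + smallRegionMb);
        System.out.println("3 MB region reported as MB: " + largerRegionMb);

        // All regions empty (or all below 1 MB): every size and the average are 0.
        boolean mergeCandidate = combinedStrictlyBelowAverage(0L, 0L, 0.0);
        System.out.println("Merge check with all-zero sizes: " + mergeCandidate);
    }
}
```

Under these assumptions, a region holding 80 KB is indistinguishable from an empty one, and a table whose regions all report 0 MB never satisfies the strict check, which matches the behaviour described in the issue.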