Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Tue, 3 Mar 2015 01:26:07 +0000 (UTC)
From: "stack (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12714102.1400037780000.58883.1425345967695@Atlassian.JIRA>
In-Reply-To: <JIRA.12714102.1400037780000@Atlassian.JIRA>
References: <JIRA.12714102.1400037780000@Atlassian.JIRA>
 <JIRA.12714102.1400037780201@arcas>
Subject: [jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M
 regions and beyond (50M regions?)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344250#comment-14344250 ] 

stack commented on HBASE-11165:
-------------------------------

In your list, IMO, item 1 alone is reason enough to do small regions.

4. Smaller regions mean less of the keyspace is offline when balancing; also more even balance is possible when regions are smaller.
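
To put rough numbers on point 4, here is a minimal back-of-envelope sketch. It assumes equally sized regions, keys spread evenly over the data (so a region's share of the data approximates its share of the keyspace), and a balancer that moves whole regions only. The function names and the 10,000 GB / 7-server figures are hypothetical, chosen for illustration; none of this is HBase code.

```python
def offline_fraction(region_size_gb, table_size_gb):
    """Share of the table's data that is unavailable while ONE region is in transit."""
    return region_size_gb / table_size_gb


def best_case_gap_gb(region_size_gb, table_size_gb, num_servers):
    """Smallest possible load gap between the fullest and emptiest server.

    Regions are indivisible, so if the region count does not divide evenly
    across servers, some server must carry one extra region.
    """
    num_regions = table_size_gb // region_size_gb
    return region_size_gb if num_regions % num_servers else 0


if __name__ == "__main__":
    table_gb, servers = 10_000, 7  # hypothetical table size and cluster size
    for region_gb in (20, 1):
        print(
            f"region={region_gb}G "
            f"offline-per-move={offline_fraction(region_gb, table_gb):.4%} "
            f"best-case-gap={best_case_gap_gb(region_gb, table_gb, servers)}G"
        )
```

Under these assumptions, both the slice of keyspace taken offline per move and the finest achievable balance scale directly with region size.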

bq. Can we only solve these three with many small regions?

If we just did small regions, without introducing anything new (other than doing what we already do 'better'/'faster'), we could improve on your list without needing to add custom compaction policies or to record interstices at 100MB intervals in metadata (which we'd have to teach clients to read), etc.

Regarding a problem statement: do you want one on why we should tend toward small regions rather than continue our current trajectory of larger and larger regions, or do you want a problem statement for the subject of this JIRA? Regarding this JIRA, we have users who are headed toward 1M regions now (Flurry reported being at 300k and afraid to go up from there, and Francis has 'larger' clusters), so we have to deal with it. Are you thinking we should explore going up from 10/20G toward 100G or 1TB (with stripe compactions++ and a means of apportioning out the 1TB region, etc., to address the 1-4 list above)?

> Scaling so cluster can host 1M regions and beyond (50M regions?)
> ----------------------------------------------------------------
>
>                 Key: HBASE-11165
>                 URL: https://issues.apache.org/jira/browse/HBASE-11165
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: stack
>         Attachments: HBASE-11165.zip, Region Scalability test.pdf, ScalableMeta.pdf, zk_less_assignment_comparison_2.pdf
>
>
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" and comments on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M regions maybe even 50M later.  This issue is about discussing how we will do that (or if not 50M on a cluster, how otherwise we can attain same end).
> More detail to follow.

