Date: Tue, 3 Mar 2015 01:26:07 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)

[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344250#comment-14344250 ]

stack commented on HBASE-11165:
-------------------------------

In your list, IMO, 1. is reason enough to do small regions.

4. Smaller regions mean less of the keyspace is offline when balancing; also, a more even balance is possible when regions are smaller.

bq. Can we only solve these three with many small regions?
If we just did small regions, without introducing anything new (other than doing what we already do 'better'/'faster'), we could improve on your list without needing to add custom compaction policies, the recording of interstices at 100MB intervals in metadata (which we'd have to teach clients to read), etc.

Regarding a problem statement: do you want one on why we should tend toward small regions rather than continue our current trajectory of larger and larger regions, or do you want a problem statement for the subject of this JIRA?

Regarding this JIRA, we have users who are headed toward 1M regions now (Flurry reported being at 300k and afraid to go up from there, and Francis has 'larger' clusters), so we have to deal. Are you thinking we should explore going up from 10/20G toward 100G or 1TB regions? (With stripe compactions++ and a means of apportioning out the 1TB region, etc., to address the 1-4 list above?)

> Scaling so cluster can host 1M regions and beyond (50M regions?)
> ----------------------------------------------------------------
>
> Key: HBASE-11165
> URL: https://issues.apache.org/jira/browse/HBASE-11165
> Project: HBase
> Issue Type: Brainstorming
> Reporter: stack
> Attachments: HBASE-11165.zip, Region Scalability test.pdf, ScalableMeta.pdf, zk_less_assignment_comparison_2.pdf
>
>
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" and comments on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M regions maybe even 50M later. This issue is about discussing how we will do that (or if not 50M on a cluster, how otherwise we can attain same end).
> More detail to follow.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)