hbase-issues mailing list archives

From "Francis Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
Date Tue, 19 Aug 2014 18:01:26 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102555#comment-14102555 ]

Francis Liu commented on HBASE-11165:

Can I have some pointers on how to read the above? Zk-less AM is better because you scan a
table – you don't have to ls znodes? What is the 1M znodes vs 1M rows about in above?
Essentially the APIs are better, i.e. with 1M rows we can iterate over the rows instead of doing an ls and
getting back a huge chunk of data. Likewise, deleting 1M znodes takes too long, whereas the same cleanup could be
parallelized against an HBase table.
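
A rough sketch of the access-pattern difference described above, not HBase or ZooKeeper code. It assumes a plain in-memory dict as a stand-in for the table, and every name in it (scan_in_batches, delete_batch, parallel_delete, the region-NNNNNNN keys) is hypothetical. It only illustrates reading keys in bounded batches rather than in one huge listing, and deleting those batches in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def scan_in_batches(sorted_keys, batch_size):
    """Yield keys in bounded batches, the way a paged scan would,
    instead of returning the whole listing in one response."""
    it = iter(sorted_keys)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def delete_batch(store, batch):
    """Delete one batch of keys and report how many were handled."""
    for key in batch:
        store.pop(key, None)
    return len(batch)


def parallel_delete(store, batch_size=1000, workers=4):
    """Split the key space into batches and delete them concurrently.
    The dict is only a stand-in (assumption); a real table would be
    scanned remotely rather than sorted in memory."""
    batches = list(scan_in_batches(sorted(store), batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda b: delete_batch(store, b), batches))


# Hypothetical data: 10,000 rows keyed by a made-up region name.
store = {f"region-{i:07d}": "OPEN" for i in range(10_000)}
first_batch = next(scan_in_batches(sorted(store), 1000))
deleted = parallel_delete(store)
print(len(first_batch), deleted, len(store))
```

The batches cover disjoint keys, so the workers never contend for the same row. That independence is what makes the delete easy to parallelize in this sketch.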

For 2.a, response is below. For 2.b, it's mainly a concern whether we'll hit other ZK issues
when having that many child znodes (1M and beyond). HDFS guys are already looking into scaling
the number of child directories for the NN.

Will update doc.

Francis Liu Is the above the basis for your "...As our experiments shows splitting is a must
for scaling."? If split meta, then more read/write throughput? 
If split meta, then: 1) Less write amplification (i.e. no large compactions), better write throughput.
2) More disks, more R/W throughput. 3) More heap to fit meta, better read throughput.

Because the meta table could be served by many machines so field more reads/writes? The reads/writes
are needed at starttime or during cluster lifetime in your judgement? Thanks.
Yep, needed for startup. We need to do experiments for 1-rack and 2-rack failures for the cluster
lifetime case. Large compactions would creep up on you, though, so splitting would still be
motivating for cluster lifetime IMHO.

> Scaling so cluster can host 1M regions and beyond (50M regions?)
> ----------------------------------------------------------------
>                 Key: HBASE-11165
>                 URL: https://issues.apache.org/jira/browse/HBASE-11165
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: stack
>         Attachments: HBASE-11165.zip, Region Scalability test.pdf, zk_less_assignment_comparison_2.pdf
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" and comments
> on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M regions maybe
> even 50M later.  This issue is about discussing how we will do that (or if not 50M on a cluster,
> how otherwise we can attain same end).
> More detail to follow.

This message was sent by Atlassian JIRA
