hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
Date Sun, 18 May 2014 21:19:53 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14001210#comment-14001210 ]

Lars Hofhansl commented on HBASE-11165:

Yep. My point was that we need to tackle this from a lot of different angles:
* autotune memstore sizes and allocate them lazily (as Andy says), or share memstores
* make large regions more workable (splits, compactions, etc.)
* allow more RAM to be used by the region server (off-heap memstores)
* fast assignments (colocate HMaster and the RS hosting META; 50M regions should then not be a problem if everything is local to a single machine)
* allow smaller units of computation in M/R
* split META? And then colocate with multiple HMasters?

Right now it is hard to even utilize the disks on a reasonably sized commodity machine...
Say you can host 100 regions on a server (30GB heap). With 20GB regions, the max disk space
you can utilize per box is 100 * 20GB * 3 (HDFS replication factor) = 6TB. Of course the
replicas are distributed across data nodes, but the averages still hold: you need one box for
each 6TB of disk space.
Larger regions and smaller memstores can fix that, but currently these lead to other issues.
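
To make the arithmetic above easy to replay with other numbers, here is a minimal sketch. The function names are hypothetical and only for illustration (they are not HBase APIs), sizes are in decimal GB, and the 1M-region total is the target from the issue description, not a measured figure.

```python
# Back-of-the-envelope sizing; all names here are illustrative, not HBase APIs.

def raw_disk_per_server_gb(regions_per_server, region_size_gb, replication=3):
    """Raw HDFS disk (GB) consumed, on average, by the regions one server hosts."""
    return regions_per_server * region_size_gb * replication


def servers_needed(total_regions, regions_per_server):
    """Region servers required to host total_regions (rounded up)."""
    return -(-total_regions // regions_per_server)


if __name__ == "__main__":
    # 100 regions per server, 20GB regions, HDFS replication factor 3
    disk_gb = raw_disk_per_server_gb(100, 20)
    print(f"disk utilized per box: {disk_gb / 1000} TB")
    # 1M regions at 100 regions per server
    print(f"servers for 1M regions: {servers_needed(1_000_000, 100)}")
```

With the numbers from this comment the first figure is 6TB per box; raising the region size or the regions-per-server count is what moves it.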

(See also http://hadoop-hbase.blogspot.com/2013/01/hbase-region-server-memory-sizing.html,
where I blogged about this a while ago)

> Scaling so cluster can host 1M regions and beyond (50M regions?)
> ----------------------------------------------------------------
>                 Key: HBASE-11165
>                 URL: https://issues.apache.org/jira/browse/HBASE-11165
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: stack
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" and comments on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M regions maybe even 50M later.  This issue is about discussing how we will do that (or if not 50M on a cluster, how otherwise we can attain same end).
> More detail to follow.
