hbase-issues mailing list archives

From "Mikhail Antonov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
Date Wed, 03 Sep 2014 20:06:53 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120349#comment-14120349 ]

Mikhail Antonov commented on HBASE-11165:
-----------------------------------------

[~stack], [~octo47] - on this compaction topic, I also mentioned early on in the thread:

bq. I wonder if it makes sense to have a Google doc linked to this JIRA to save various proposals,
findings and estimates? Like one that summarizes current usage to be conservatively 3.5 GB in meta
/ 1M regions.

So it seems we're using 3-3.5 KB per region row? That should be compressible, judging by
the data in the meta rows. I also think it would help to post some numbers here and capture
them in the document, so we have a baseline for our work. For example:

 - how many KB of memory each region takes in meta
 - how many HDFS inodes each region takes (this depends on the number of store files, but a rough estimate?)
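
As a rough baseline for the first bullet, the figure quoted above (conservatively 3.5 GB of meta for 1M regions) can be turned into a per-region number and scaled linearly. This is only a back-of-the-envelope sketch: it assumes decimal units and that meta grows linearly with region count, and the function names are made up for illustration.

```python
# Back-of-the-envelope meta sizing from the figure quoted in this thread.
# Assumptions (not measured): decimal units, meta size linear in region count.

META_BYTES_AT_1M = 3.5e9      # "conservatively 3.5 GB in meta / 1M regions"
REGIONS_AT_1M = 1_000_000


def meta_bytes_per_region(meta_bytes=META_BYTES_AT_1M, regions=REGIONS_AT_1M):
    """Average meta footprint of one region row, in bytes."""
    return meta_bytes / regions


def projected_meta_bytes(regions, per_region=None):
    """Linear projection of total meta size for a given region count."""
    if per_region is None:
        per_region = meta_bytes_per_region()
    return regions * per_region


if __name__ == "__main__":
    per_region = meta_bytes_per_region()
    print(f"per region: {per_region / 1e3:.1f} KB")
    for n in (1_000_000, 50_000_000):
        print(f"{n:>11,} regions -> {projected_meta_bytes(n) / 1e9:.1f} GB of meta")
```

Measured numbers, once posted, can simply replace the two constants at the top.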

As an estimate: how big would a deployment be where meta doesn't fit in memory? How many RSs,
how many petabytes of data?
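
One way to frame that question is to pick a memory budget for meta and work backwards to region count, RS count and data volume. The sketch below does that arithmetic; apart from the ~3.5 KB per region row taken from this thread, every input (memory budget, average region size, regions per RS) is a hypothetical placeholder and not a measurement, and the helper names are invented for illustration.

```python
# Rough sizing of a cluster at the point where meta stops fitting in memory.
# Every input below is a placeholder assumption, not a measurement.
import math
from dataclasses import dataclass


@dataclass
class Deployment:
    regions: int           # regions whose meta rows fit in the budget
    region_servers: int    # RSs needed to host those regions
    data_petabytes: float  # total data at the assumed average region size


def deployment_at_meta_limit(meta_budget_bytes, meta_bytes_per_region,
                             avg_region_bytes, regions_per_rs):
    """Size of the deployment whose meta exactly fills the given budget."""
    regions = meta_budget_bytes // meta_bytes_per_region
    region_servers = math.ceil(regions / regions_per_rs)
    data_petabytes = regions * avg_region_bytes / 1e15
    return Deployment(regions, region_servers, data_petabytes)


if __name__ == "__main__":
    d = deployment_at_meta_limit(
        meta_budget_bytes=32 * 10**9,   # hypothetical memory share for meta
        meta_bytes_per_region=3_500,    # ~3.5 KB per region row, from this thread
        avg_region_bytes=10 * 10**9,    # hypothetical 10 GB average region
        regions_per_rs=1_000,           # hypothetical regions hosted per RS
    )
    print(d)
```

The result is only as good as the placeholders, so it is mainly useful for seeing which input dominates once real numbers are available.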

> Scaling so cluster can host 1M regions and beyond (50M regions?)
> ----------------------------------------------------------------
>
>                 Key: HBASE-11165
>                 URL: https://issues.apache.org/jira/browse/HBASE-11165
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: stack
>         Attachments: HBASE-11165.zip, Region Scalability test.pdf, zk_less_assignment_comparison_2.pdf
>
>
> This discussion issue comes out of "Co-locate Meta And Master HBASE-10569" and comments
> on the doc posted there.
> A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M regions maybe
> even 50M later.  This issue is about discussing how we will do that (or if not 50M on a cluster,
> how otherwise we can attain same end).
> More detail to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
