hbase-issues mailing list archives

From "Jonathan Hsieh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11985) Document sizing rules of thumb
Date Tue, 16 Sep 2014 19:00:34 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135971#comment-14135971
] 

Jonathan Hsieh commented on HBASE-11985:
----------------------------------------


Pending MOB work is targeted at the 100 KB to 10 MB range. It is TBD whether it makes sense for 10 MB to 64 MB
(a few cells in that range would be OK, but a lot of them might not be).

A large number of *column families* probably means you are doing it wrong.
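
As a rough illustration of the cell-size ranges above, here is a minimal Python sketch. The function name, the tier labels, and the use of binary units (1 KB = 1024 bytes) are assumptions made for this example; only the 100 KB, 10 MB, and 64 MB boundaries come from this thread, and the final tier follows the issue description's suggestion for cells over 10 MB.

```python
KB = 1024          # assumption: binary units; the thread does not say which
MB = 1024 * KB


def classify_cell(size_bytes: int) -> str:
    """Map a cell size to the rule-of-thumb tier discussed in this thread.

    The labels are illustrative only; they are not HBase terminology.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < 100 * KB:
        return "regular cell"
    if size_bytes <= 10 * MB:
        return "MOB target range"
    if size_bytes <= 64 * MB:
        return "MOB undecided"
    return "consider HDFS plus a reference in HBase"


if __name__ == "__main__":
    for size in (4 * KB, 2 * MB, 32 * MB, 200 * MB):
        print(size, "->", classify_cell(size))
```

The thresholds are kept as plain constants so they are easy to adjust once the MOB work settles.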



> Document sizing rules of thumb
> ------------------------------
>
>                 Key: HBASE-11985
>                 URL: https://issues.apache.org/jira/browse/HBASE-11985
>             Project: HBase
>          Issue Type: Task
>          Components: documentation
>            Reporter: Misty Stanley-Jones
>            Assignee: Misty Stanley-Jones
>
> I'm looking for tuning/sizing rules of thumb to put in the Ref Guide.
> Info I have gleaned so far:
> A reasonable region size is between 10 GB and 50 GB.
> A reasonable maximum cell size is 1 MB to 10 MB. If your cells are larger than 10 MB,
> consider storing the cell contents in HDFS and storing a reference to the location in HBase.
> MOB work is pending for the 10 MB to 64 MB window.
> When you size your regions and cells, keep in mind that a region cannot split across
> a row. If your row size is too large, or your region size is too small, you can end up with
> a single row per region, which is not a good pattern. It is also possible that one big column
> causes splits while other columns are tiny, and this may not be great.
> A large number of columns probably means you are doing it wrong.
> Column names need to be short because they get stored for every value (barring encoding).
> They don't need to be self-documenting as in an RDBMS.
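
To make the column-name point concrete, here is a small Python sketch of the arithmetic. It assumes no block encoding or compression and counts only the bytes spent repeating the family and qualifier names in every cell; the function name and the sample names are hypothetical.

```python
def name_bytes(num_cells: int, family: str, qualifier: str) -> int:
    """Bytes spent storing the column family and qualifier names.

    Assumes the names are repeated verbatim with every cell
    (no encoding, no compression).
    """
    per_cell = len(family.encode("utf-8")) + len(qualifier.encode("utf-8"))
    return num_cells * per_cell


if __name__ == "__main__":
    cells = 1_000_000_000
    long_names = name_bytes(cells, "d", "customer_email_address")
    short_names = name_bytes(cells, "d", "e")
    print("long qualifier: ", long_names, "bytes")
    print("short qualifier:", short_names, "bytes")
    print("difference:     ", long_names - short_names, "bytes")
```

Under these assumptions, shortening a 22-byte qualifier to a single byte saves 21 bytes per cell, which is about 21 GB of raw name data across a billion cells, before replication.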



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
