hbase-issues mailing list archives

From "Matt Corgan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9399) Up the memstore flush size
Date Tue, 17 Dec 2013 05:35:09 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850125#comment-13850125 ]

Matt Corgan commented on HBASE-9399:

It could be cool to flush the memstore into the block cache periodically.  The block cache
would hold in-memory copies of hfiles, where the blocks are labeled as transient so they don't
get evicted.  Several in-memory hfiles could build up in the block cache before a flush that
merges them together while writing to disk (or while writing back to the block cache).  This
would reduce the memory footprint of the data by eliminating significant CSLM overhead, and
it could be further reduced with block encoding.  It would also let us give a greater % of
the memory to the block cache, where the eviction algorithm can better prioritize what to
keep.  Maybe some regions could grow to 2GB of transient memstore blocks while other regions
are persisted at 64MB.

(sorry this is out of place on this jira)
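The flush-into-block-cache idea above can be sketched roughly as follows. This is a minimal illustration of the mechanism being proposed, not HBase code; every class and method name here is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch: flush memstore snapshots into the block cache as
 * "transient" in-memory hfiles that are never evicted, then merge several
 * of them into one persisted hfile once enough have accumulated.
 */
class TransientFlushSketch {
    static class CachedHFile {
        final long sizeBytes;
        final boolean transientBlock;   // transient blocks may not be evicted
        CachedHFile(long sizeBytes, boolean transientBlock) {
            this.sizeBytes = sizeBytes;
            this.transientBlock = transientBlock;
        }
    }

    private final List<CachedHFile> cache = new ArrayList<>();
    private final long mergeThresholdBytes;

    TransientFlushSketch(long mergeThresholdBytes) {
        this.mergeThresholdBytes = mergeThresholdBytes;
    }

    /** Flush a memstore snapshot into the cache as a transient in-memory hfile. */
    void flushToCache(long snapshotBytes) {
        cache.add(new CachedHFile(snapshotBytes, true));
    }

    /** Total bytes held by unevictable transient hfiles. */
    long transientBytes() {
        return cache.stream()
                .filter(f -> f.transientBlock)
                .mapToLong(f -> f.sizeBytes)
                .sum();
    }

    /**
     * Once enough transient hfiles have built up, merge them into a single
     * persisted hfile and clear the transient flag so its blocks become
     * evictable like any other cached block.
     */
    boolean maybeMergeToDisk() {
        if (transientBytes() < mergeThresholdBytes) {
            return false;
        }
        long merged = transientBytes();
        cache.clear();
        cache.add(new CachedHFile(merged, false)); // persisted, now evictable
        return true;
    }
}
```

The point of the sketch is only the lifecycle: several small transient flushes accumulate unevicted, and a single merge both persists them and releases the memory pressure on the cache.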

> Up the memstore flush size
> --------------------------
>                 Key: HBASE-9399
>                 URL: https://issues.apache.org/jira/browse/HBASE-9399
>             Project: HBase
>          Issue Type: Task
>          Components: regionserver
>    Affects Versions: 0.98.0, 0.96.0
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>             Fix For: 0.98.0
> As heap sizes get bigger we are still recommending that users keep their number of regions
to a minimum.  This leads to lots of un-used memstore memory.
> For example I have a region server with 48 gigs of ram.  30 gigs are there for the region
server.  With current defaults, the global memstore size reserved is 8 gigs.
> The per region memstore size is 128mb right now.  That means that I need 80 regions actively
taking writes to reach the global memstore size.  That number is way out of line with what
our split policies currently give users.  They are given far fewer regions by default.
> We should up hbase.hregion.memstore.flush.size.  Ideally we should auto-tune
everything.  But until then I think something like 512mb would help a lot with our write throughput
on clusters that don't have several hundred regions per RS.
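The arithmetic in the description can be sketched as below. This is a hypothetical helper, not HBase code, and it assumes every memstore sits exactly at its flush threshold when the global limit is reached; in practice memstores average below the threshold, which is presumably why the description cites ~80 regions rather than the 64 this naive division gives:

```java
/**
 * Back-of-envelope arithmetic: how many regions must be actively taking
 * writes before per-region memstores can fill the global memstore
 * reservation, for a given flush size.
 */
class MemstoreMath {
    /** Naive estimate: global reservation divided by per-region flush size. */
    static long regionsToFillGlobal(long globalMb, long flushSizeMb) {
        return globalMb / flushSizeMb;
    }

    public static void main(String[] args) {
        long globalMb = 8 * 1024;  // 8 gig global memstore reservation

        // Current default flush size of 128mb:
        System.out.println(regionsToFillGlobal(globalMb, 128));  // prints 64

        // Proposed 512mb flush size:
        System.out.println(regionsToFillGlobal(globalMb, 512));  // prints 16
    }
}
```

Raising the flush size to 512mb cuts the region count needed to use the reserved memstore memory by 4x, which is the write-throughput argument being made here.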

This message was sent by Atlassian JIRA
