hbase-user mailing list archives

From Esteban Gutierrez <este...@cloudera.com>
Subject Re: impact of using higher Hbase.hregion.memstore.flush.size=512MB
Date Wed, 27 May 2015 21:15:36 GMT
Gautam,

Yes, you can increase the size of the memstore to values larger than 128MB,
but usually you go by increasing hbase.hregion.memstore.block.multiplier
only. Depending on the version of HBase you are running, many things can
happen: multiple memstores can be flushed at once, memstores will be
flushed if there are too many rows in memory (30 million) or if the store
hasn't been flushed in an hour, the rate of the flushes can be tuned, and
hitting the max number of HLogs can also trigger a flush. One problem with
running large memstores is mostly how many regions you will have per RS;
also, if an encoding and/or compression codec is being used, flushes might
take longer, use more CPU resources, or push back clients b/c some regions
haven't been flushed to disk.
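For reference, the knobs mentioned above live in hbase-site.xml. A sketch
with typical values follows; the numbers here are illustrative defaults,
so check the defaults shipped with your HBase version before relying on them:

```xml
<!-- Illustrative hbase-site.xml fragment; defaults vary by HBase version. -->
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>134217728</value> <!-- per-region flush threshold, 128 MB -->
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>4</value> <!-- updates blocked at multiplier * flush.size per region -->
</property>
<property>
  <name>hbase.regionserver.optionalcacheflushinterval</name>
  <value>3600000</value> <!-- flush memstores idle for an hour (ms) -->
</property>
<property>
  <name>hbase.regionserver.maxlogs</name>
  <value>32</value> <!-- too many WAL files also forces flushes -->
</property>
```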

Based on the behavior that you have described, the heap utilization
sounds like you are not fully utilizing the memstores and you are below the
lower limit, so depending on the version of HBase and the available
resources you might want to use hbase.rs.cacheblocksonwrite instead to keep
some of the hot data in the block cache.
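A quick back-of-the-envelope check of the numbers from this thread
(illustrative only; the percentages come from Gautam's observations below):

```python
# Sanity-check the memstore heap math described in this thread.
HEAP_GB = 20.0
LOWER_LIMIT = 0.45  # hbase.regionserver.global.memstore.lowerLimit

# Global memstore budget before pressure-driven flushing begins.
global_budget_gb = HEAP_GB * LOWER_LIMIT
print(f"global memstore budget: {global_budget_gb:.1f} GB")

# With a 128 MB flush size, each region's memstore flushes on its own long
# before the global budget fills, matching the ~10% heap utilization observed.
resident_at_128mb_gb = 0.10 * HEAP_GB
resident_at_512mb_gb = 0.35 * HEAP_GB
print(f"~{resident_at_128mb_gb:.0f} GB resident at 128 MB flush size, "
      f"~{resident_at_512mb_gb:.0f} GB at 512 MB")
```

Even at 35% of a 20 GB heap, the memstores sit well under the 9 GB lower
limit, which is why the global limits never kick in here.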

cheers,
esteban.

--
Cloudera, Inc.


On Wed, May 27, 2015 at 1:58 PM, Gautam Borah <gborah@appdynamics.com>
wrote:

> Hi all,
>
> The default size of hbase.hregion.memstore.flush.size is defined as 128 MB.
> Could anyone kindly explain what the impact would be if we increased this to
> a higher value, such as 512 MB or 800 MB or more?
>
> We have a very write-heavy cluster. We also run periodic endpoint
> coprocessor based jobs every 10 minutes that operate on the data written in
> the last 10-15 minutes. We are trying to manage the memstore flush
> operations such that the hot data remains in the memstore for at least 30-40
> minutes or longer, so that the job hits disk only every 3rd or 4th time it
> tries to operate on the hot data (it does scans).
>
> We have region server heap size of 20 GB and set the,
>
> hbase.regionserver.global.memstore.lowerLimit = .45
>
> hbase.regionserver.global.memstore.upperLimit = .55
>
> We observed that if we set hbase.hregion.memstore.flush.size=128MB, only
> 10% of the heap is utilized by the memstores before they flush.
>
> At hbase.hregion.memstore.flush.size=512MB, we are able to increase the
> heap utilization by the memstores to 35%.
>
> It would be very helpful for us to understand the implications of a higher
> hbase.hregion.memstore.flush.size for a long-running cluster.
>
> Thanks,
>
> Gautam
>
