hbase-user mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: optimal size for Hbase.hregion.memstore.flush.size and its impact
Date Mon, 24 Aug 2015 17:55:22 GMT
Correction:

> 4. WAL is rolling prematurely, controlled by hbase.regionserver.maxlogs
> and dfs.block.size.

Should read:
4. WAL is rolling, controlled by hbase.regionserver.maxlogs and
dfs.block.size.

-Vlad

On Mon, Aug 24, 2015 at 10:36 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> Related: please see HBASE-13408 (HBase In-Memory Memstore Compaction).
>
> FYI
>
> On Mon, Aug 24, 2015 at 10:32 AM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org> wrote:
>
> > The split policy also uses the flush size to estimate how to split
> > tables...
> >
> > It's sometimes fine to raise this number a bit, e.g. to 256MB. But 512MB
> > is pretty high, and 800MB is even higher.
> >
> > Big memstores take more time to flush and can block writes if the flushes
> > are not fast enough. If yours are fast enough, then you might be able to
> > stay with 512MB. I don't think 800MB is a good idea...
> >
> > JM
> >
> > 2015-08-24 13:23 GMT-04:00 Vladimir Rodionov <vladrodionov@gmail.com>:
> >
> > > 1. How many regions per RS?
> > > 2. What is your dfs.block.size?
> > > 3. What is your hbase.regionserver.maxlogs?
> > >
> > > Flush can be requested when:
> > >
> > > 1. Region's memstore size exceeds hbase.hregion.memstore.flush.size
> > > 2. Region's memstore is too old (the periodic memstore flusher checks the
> > >    age of the memstore; default is 1 hour). Controlled by
> > >    hbase.regionserver.optionalcacheflushinterval (in ms)
> > > 3. There are too many unflushed changes in a Region. Controlled by
> > >    hbase.regionserver.flush.per.changes, default is 30,000,000
> > > 4. WAL is rolling prematurely, controlled by hbase.regionserver.maxlogs
> > >    and dfs.block.size.
> > >
> > > The optimal setting satisfies:
> > >
> > >     hbase.regionserver.maxlogs * dfs.block.size * 0.95 >
> > >     hbase.regionserver.global.memstore.upperLimit * HBASE_HEAPSIZE
> > >
> > > I recommend enabling DEBUG logging and analyzing the MemStoreFlusher,
> > > PeriodicMemstoreFlusher and HRegion flush-related log messages to get an
> > > idea of why a flush was requested on a region (or regions) and what the
> > > region size was at that time.
> > >
> > > I think, in your case it is either premature WAL rolling or too many
> > > changes in a memstore.
> > >
> > > -Vlad
> > >
> > >
> > > On Wed, May 27, 2015 at 1:53 PM, Gautam Borah <gautam.borah@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > The default value of hbase.hregion.memstore.flush.size is defined as
> > > > 128 MB. Could anyone kindly explain what the impact would be if we
> > > > increased this to a higher value such as 512 MB, 800 MB or more?
> > > >
> > > > We have a very write-heavy cluster. We also run periodic endpoint
> > > > coprocessor based jobs, every 10 minutes, that operate on the data
> > > > written in the last 10-15 minutes. We are trying to manage the memstore
> > > > flush operations such that the hot data remains in the memstore for at
> > > > least 30-40 minutes or longer, so that the job hits disk only every 3rd
> > > > or 4th time it tries to operate on the hot data (it does a scan).
> > > >
> > > > We have a region server heap size of 20 GB and have set:
> > > >
> > > > hbase.regionserver.global.memstore.lowerLimit = .45
> > > > hbase.regionserver.global.memstore.upperLimit = .55
> > > >
> > > > We observed that if we set hbase.hregion.memstore.flush.size=128MB,
> > > > only 10% of the heap is utilized by the memstore before it flushes.
> > > >
> > > > At hbase.hregion.memstore.flush.size=512MB, we are able to increase the
> > > > heap utilization by the memstore to 35%.
> > > >
> > > > It would be very helpful for us to understand the implications of a
> > > > higher hbase.hregion.memstore.flush.size for a long-running cluster.
> > > >
> > > > Thanks,
> > > > Gautam
> > >
> >
>
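
The four flush triggers listed in the thread can be sketched as a small
checklist. This is only a simplified model for reasoning about which
condition fires first, not HBase's own code. The names FlushSettings and
flush_reasons are hypothetical, the maxlogs value of 32 is an assumed setting
(the thread never states it), and trigger 4 is reduced to "more WAL files
than maxlogs".

```python
from dataclasses import dataclass

MB = 1024 ** 2


@dataclass(frozen=True)
class FlushSettings:
    """Hypothetical container for the settings named in the thread."""
    flush_size_bytes: int = 128 * MB            # hbase.hregion.memstore.flush.size
    max_memstore_age_ms: int = 60 * 60 * 1000   # hbase.regionserver.optionalcacheflushinterval
    max_unflushed_changes: int = 30_000_000     # hbase.regionserver.flush.per.changes
    maxlogs: int = 32                           # hbase.regionserver.maxlogs (assumed value)


def flush_reasons(memstore_bytes, memstore_age_ms, unflushed_changes,
                  wal_files, settings=FlushSettings()):
    """Return the names of the triggers that would request a flush."""
    reasons = []
    if memstore_bytes > settings.flush_size_bytes:
        reasons.append("memstore size")
    if memstore_age_ms > settings.max_memstore_age_ms:
        reasons.append("memstore age")
    if unflushed_changes > settings.max_unflushed_changes:
        reasons.append("too many unflushed changes")
    # Simplification: the real WAL logic is more involved than a file count.
    if wal_files > settings.maxlogs:
        reasons.append("WAL count")
    return reasons


# A region holding only 64 MB but with 31 million small unflushed edits.
print(flush_reasons(64 * MB, 10 * 60 * 1000, 31_000_000, 5))
```

A region like the one in the example flushes well below the configured flush
size, which is one way a memstore can stay far smaller than
hbase.hregion.memstore.flush.size would suggest.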

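The sizing rule quoted in the thread (maxlogs * dfs.block.size * 0.95 should
exceed global.memstore.upperLimit * heap size) is easy to check with a few
lines of arithmetic. The sketch below uses the 20 GB heap and 0.55 upper limit
from the original question. The 128 MB dfs.block.size and the maxlogs value of
32 are assumptions, since neither is given in the thread, and the function
names are hypothetical.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3


def wal_capacity_bytes(maxlogs, block_size_bytes, roll_multiplier=0.95):
    """Left-hand side of the rule: data the WALs can hold before forcing flushes."""
    return maxlogs * block_size_bytes * roll_multiplier


def global_memstore_bytes(heap_bytes, upper_limit):
    """Right-hand side of the rule: heap the memstores may occupy in total."""
    return heap_bytes * upper_limit


def min_maxlogs(heap_bytes, upper_limit, block_size_bytes, roll_multiplier=0.95):
    """Smallest integer maxlogs for which WAL capacity exceeds the memstore limit."""
    target = global_memstore_bytes(heap_bytes, upper_limit)
    per_log = block_size_bytes * roll_multiplier
    return math.floor(target / per_log) + 1


heap = 20 * GB          # from the thread
upper_limit = 0.55      # from the thread
block_size = 128 * MB   # assumption, not stated in the thread
maxlogs = 32            # assumption, not stated in the thread

satisfied = wal_capacity_bytes(maxlogs, block_size) > global_memstore_bytes(heap, upper_limit)
print("rule satisfied:", satisfied)
print("minimum maxlogs:", min_maxlogs(heap, upper_limit, block_size))
```

Under these assumed values the rule is not satisfied, and maxlogs would need
to be about 93 before the WAL limit stops forcing flushes ahead of the global
memstore limit. With a different block size or heap the numbers change
accordingly.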