hbase-user mailing list archives

From Anoop John <anoop.hb...@gmail.com>
Subject Re: Increasing write throughput..
Date Mon, 03 Nov 2014 01:46:46 GMT
You have ~280 regions per RS.
And your memstore size % is 40% and the heap size is 48GB.
That means the heap available for memstores is 48 * 0.4 = 19.2GB  (I am just
considering the upper watermark alone)

If you account for all 280 regions, each with a 512 MB memstore, you need much
more heap than that.   And your writes are distributed to all regions, right?

So you will be seeing flushes because of global heap pressure.
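
Rough numbers to make that concrete (a back-of-the-envelope sketch based only on the
figures above, not your exact config):

  48 GB heap * 0.40 upper limit   = ~19.2 GB global memstore cap
  280 regions * 512 MB flush size = ~140 GB if every region were allowed to fill up
  19.2 GB / 280 regions           = ~70 MB average memstore per region

So the global limit is hit long before any single region comes anywhere near 512 MB,
and regions get flushed at whatever size they have reached at that point.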

Increasing the Xmx and flush size alone won't help.  You need to consider
the number of regions and how the writes are distributed.

When you tune this, the next step will be to tune the HLog (WAL) and its rolling.
That depends on your cell size as well.
By default, when an HLog file reaches 95% of the HDFS block size we roll to a new HLog
file, and when we reach 32 log files we force flushes.  FYI.
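
For reference, these are the knobs involved, sketched from memory with what I believe
are the 0.98 defaults, so please verify them against your version (the block size itself
is taken from HDFS, overridable via hbase.regionserver.hlog.blocksize):

<property>
  <name>hbase.regionserver.maxlogs</name>
  <value>32</value>  <!-- force flushes once this many WAL files have accumulated -->
</property>
<property>
  <name>hbase.regionserver.logroll.multiplier</name>
  <value>0.95</value>  <!-- roll to a new HLog at this fraction of the block size -->
</property>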

-Anoop-


On Sat, Nov 1, 2014 at 10:54 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Please read 9.7.7.2. MemStoreFlush under
> http://hbase.apache.org/book.html#regions.arch
>
> Cheers
>
> On Fri, Oct 31, 2014 at 11:16 AM, Gautam Kowshik <gautamkowshik@gmail.com>
> wrote:
>
> > - Sorry about the raw image upload, here's the tsdb snapshot:
> > http://postimg.org/image/gq4nf96x9/
> > - HBase version 0.98.1 (CDH 5.1 distro)
> > - hbase-site pastebin: http://pastebin.com/fEctQ3im
> > - This table 'msg' has been pre-split into 240 regions and writes are
> > evenly distributed across 240 buckets (the bucket is a prefix of the row
> > key). These regions are well spread across the 8 RSs, although over time
> > those 240 have split and become 2440, so each region server now has ~280
> > regions.
> > - Last 500 lines of log from one RS: http://pastebin.com/8MwYMZPb
> > - No hot regions from what I can tell.
> >
> > One of my main concerns was why, even after setting the memstore flush
> > size to 512M, it is still flushing at 128M. Is there a setting I've missed?
> > I'll try to get more details as I find them.
> >
> > Thanks and Cheers,
> > -Gautam.
> >
> > On Oct 31, 2014, at 10:47 AM, Stack <stack@duboce.net> wrote:
> >
> > > What version of hbase? (Later versions have improvements in write
> > > throughput, especially with many writing threads.)  Post a pastebin of a
> > > regionserver log in steady state if you don't mind.  About how many
> > > writers going into the server at a time?  How many regions on the server?
> > > All being written to at the same rate or do you have hotties?
> > > Thanks,
> > > St.Ack
> > >
> > > On Fri, Oct 31, 2014 at 10:22 AM, Gautam <gautamkowshik@gmail.com>
> > wrote:
> > >
> > >> I'm trying to increase the write throughput of our hbase cluster. We're
> > >> currently doing around 7500 messages per sec per node. I think we have
> > >> room for improvement, especially since the heap is underutilized and the
> > >> memstore size doesn't seem to fluctuate much between regular and peak
> > >> ingestion loads.
> > >>
> > >> We mainly have one large table that we write most of the data to. Other
> > >> tables are mainly opentsdb and some relatively small summary tables. This
> > >> table is read in batch once a day but otherwise is mostly serving writes
> > >> 99% of the time. This large table has 1 CF and gets flushed at
> > >> ~128M fairly regularly, like below:
> > >>
> > >> {log}
> > >>
> > >> 2014-10-31 16:56:09,499 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> > >> Finished memstore flush of ~128.2 M/134459888, currentsize=879.5 K/900640
> > >> for region
> > >> msg,00102014100515impression\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x002014100515040200049358\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x004138647301\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0002e5a329d2171149bcc1e83ed129312b\x00\x00\x00\x00,1413909604591.828e03c0475b699278256d4b5b9638a2.
> > >> in 640ms, sequenceid=16861176169, compaction requested=true
> > >>
> > >> {log}
> > >>
> > >> Here's a pastebin of my hbase site : http://pastebin.com/fEctQ3im
> > >>
> > >> What I've tried:
> > >> - Turned off major compactions, and handling these manually.
> > >> - Bumped up heap Xmx from 24G to 48G.
> > >> - hbase.hregion.memstore.flush.size = 512M.
> > >> - lowerLimit/upperLimit on memstore are the defaults (0.38, 0.4) since the
> > >> global heap has enough space to accommodate the default percentages.
> > >> - Currently running HBase 0.98.1 on an 8-node cluster that's been scaled up to
> > >> 128GB RAM.
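> > >>
> > >> (In hbase-site.xml terms those memstore changes boil down to roughly the
> > >> following; the pastebin above has the exact file:)
> > >>
> > >> <property>
> > >>   <name>hbase.hregion.memstore.flush.size</name>
> > >>   <value>536870912</value>  <!-- 512 MB -->
> > >> </property>
> > >> <property>
> > >>   <name>hbase.regionserver.global.memstore.upperLimit</name>
> > >>   <value>0.4</value>  <!-- default -->
> > >> </property>
> > >> <property>
> > >>   <name>hbase.regionserver.global.memstore.lowerLimit</name>
> > >>   <value>0.38</value>  <!-- default -->
> > >> </property>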
> > >>
> > >>
> > >> There hasn't been any appreciable increase in write perf. Still hovering
> > >> around the 7500-per-node write throughput number. The flushes still seem to
> > >> be happening at 128M (instead of the expected 512M).
> > >>
> > >> I've attached a snapshot of the memstore size vs. flushQueueLen. The block
> > >> caches are utilizing the extra heap space but not the memstore. The flush
> > >> queue lengths have increased, which leads me to believe that it's flushing
> > >> way too often without any increase in throughput.
> > >>
> > >> Please let me know where I should dig further. That's a long email; thanks
> > >> for reading through :-)
> > >>
> > >>
> > >>
> > >> Cheers,
> > >> -Gautam.
> > >>
> >
> >
>
