Subject: Re: Increasing write throughput..
From: Ted Yu
To: "user@hbase.apache.org"
Date: Fri, 31 Oct 2014 10:51:44 -0700

Gautam:

bq. I've attached a snapshot of the memstore size vs. flushQueueLen

That didn't go through. Consider using a third-party site.

Answers to Stack's questions would help us get more clues.

Cheers

On Fri, Oct 31, 2014 at 10:47 AM, Stack wrote:

> What version of HBase are you on? (Later versions have improvements in write
> throughput, especially with many writing threads.) Post a pastebin of the
> regionserver log in steady state if you don't mind. About how many writers
> are going into the server at a time? How many regions are on the server? Are
> they all being written to at the same rate, or do you have hot regions?
> Thanks,
> St.Ack
>
> On Fri, Oct 31, 2014 at 10:22 AM, Gautam wrote:
>
> > I'm trying to increase the write throughput of our HBase cluster. We're
> > currently doing around 7500 messages per sec per node. I think we have
> > room for improvement, especially since the heap is underutilized and the
> > memstore size doesn't seem to fluctuate much between regular and peak
> > ingestion loads.
> >
> > We mainly have one large table that we write most of the data to. Other
> > tables are mainly OpenTSDB and some relatively small summary tables. This
> > table is read in batch once a day but otherwise is mostly serving writes
> > 99% of the time. This large table has 1 CF and gets flushed at around
> > ~128M fairly regularly, like below:
> >
> > {log}
> > 2014-10-31 16:56:09,499 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> > Finished memstore flush of ~128.2 M/134459888, currentsize=879.5 K/900640
> > for region
> > msg,00102014100515impression\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x002014100515040200049358\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x004138647301\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0002e5a329d2171149bcc1e83ed129312b\x00\x00\x00\x00,1413909604591.828e03c0475b699278256d4b5b9638a2.
> > in 640ms, sequenceid=16861176169, compaction requested=true
> > {log}
> >
> > Here's a pastebin of my hbase-site.xml: http://pastebin.com/fEctQ3im
> >
> > What I've tried:
> > - Turned off major compactions, and am handling these manually.
> > - Bumped up heap Xmx from 24G to 48G.
> > - hbase.hregion.memstore.flush.size = 512M.
> > - lowerLimit/upperLimit on the memstore are at the defaults (0.38, 0.4),
> > since the global heap has enough space to accommodate the default
> > percentages.
> > - Currently running HBase 0.98.1 on an 8-node cluster that's scaled up to
> > 128GB RAM.
> >
> > There hasn't been any appreciable increase in write perf. It's still
> > hovering around the 7500-per-node write throughput number. The flushes
> > still seem to be happening at 128M (instead of the expected 512M).
> >
> > I've attached a snapshot of the memstore size vs. flushQueueLen. The block
> > caches are utilizing the extra heap space but not the memstore. The flush
> > queue lengths have increased, which leads me to believe that it's flushing
> > way too often without any increase in throughput.
> >
> > Please let me know where I should dig further. That's a long email, thanks
> > for reading through :-)
> >
> > Cheers,
> > -Gautam.
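
For reference, a minimal hbase-site.xml sketch of the settings discussed above. The values simply mirror the figures quoted in the thread (512M flush size, the 0.38/0.4 global memstore defaults, major compactions disabled); they are not taken from Gautam's actual pastebin, and the property names are the 0.98-era ones:

  <!-- Values mirror the thread, not Gautam's pastebin. -->
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>536870912</value> <!-- flush a region's memstore at 512M (value in bytes) -->
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.upperLimit</name>
    <value>0.4</value> <!-- default: force flushes once all memstores reach 40% of heap -->
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.lowerLimit</name>
    <value>0.38</value> <!-- default: forced flushing continues down to 38% of heap -->
  </property>
  <property>
    <name>hbase.hregion.majorcompaction</name>
    <value>0</value> <!-- disable time-based major compactions -->
  </property>

With periodic major compactions disabled as above, they can still be run by hand, e.g. major_compact 'msg' from the HBase shell (table name taken from the log line above).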