From: Todd Lipcon
Date: Fri, 13 Apr 2012 20:02:43 -0700
Subject: Re: 0.92 and Read/writes not scaling
To: user@hbase.apache.org

To close the loop on this thread, we were able to track down the issue.
See https://issues.apache.org/jira/browse/HDFS-3280 - just committed it in
HDFS. It's a simple patch if you want to patch your own build. Otherwise
this should show up in CDH4 nightly builds tonight, and I think in CDH4b2
as well.

If you want to patch on the HBase side, you can edit HLog.java to remove
the checks for the "sync" method and have it only call "hflush". It's only
the compatibility path that caused the problem.
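[Editor's note: a minimal sketch, not the actual HLog.java change, of what
"only call hflush" could look like. It assumes a Hadoop release (0.21+/CDH4)
where FSDataOutputStream exposes hflush(); the class and method names below
are illustrative.]

    // Illustrative only -- not the real HLog.java diff. Calling hflush()
    // directly skips the reflection-based probe for the legacy sync()
    // method, i.e. the compatibility path mentioned above.
    import java.io.IOException;
    import org.apache.hadoop.fs.FSDataOutputStream;

    class WalSyncSketch {
        private final FSDataOutputStream out;

        WalSyncSketch(FSDataOutputStream out) {
            this.out = out;
        }

        // Flush buffered WAL edits out to the datanodes.
        void syncWal() throws IOException {
            out.hflush();
        }
    }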
Thanks
-Todd

On Wed, Apr 4, 2012 at 8:02 PM, Juhani Connolly wrote:
> done, thanks for pointing me to that
>
> On 04/05/2012 11:43 AM, Ted Yu wrote:
>> Juhani:
>> Thanks for sharing your results.
>>
>> Do you mind putting the summary on HBASE-5699: Run with > 1 WAL in
>> HRegionServer?
>>
>> On Wed, Apr 4, 2012 at 6:45 PM, Juhani Connolly <
>> juhani_connolly@cyberagent.co.jp> wrote:
>>
>>> Another quick update:
>>>
>>> Since moving back to HDFS 0.20.2 (with HBase still at 0.92), we found
>>> that while we made significant gains in throughput, most of our
>>> regionservers' IPC threads were stuck somewhere in HWal.append (out of
>>> 50, 42 were in append, of which 20 were in sync), limiting throughput
>>> despite significant free hardware resources.
>>>
>>> Because the WAL writes of a single RS all go sequentially to one HDFS
>>> file, we assumed that we could improve throughput by spreading writes
>>> over more WAL files and more disks. To do this we ran multiple region
>>> servers on each node.
>>>
>>> The scaling wasn't linear (we were in no way increasing hardware, just
>>> the number of regionservers), but we are now getting significantly more
>>> throughput.
>>>
>>> I would personally not say that this is a great approach to have to
>>> take; it would generally be better to build more, smaller servers,
>>> which will then not limit themselves by trying to push a lot of data
>>> per server through a single WAL file.
>>>
>>> Of course, there may be another solution to this that I'm not aware of;
>>> if so, I'd love to hear it.

-- 
Todd Lipcon
Software Engineer, Cloudera
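[Editor's note: for readers who want to try the multiple-RegionServers-per-node
setup described in the quoted message above, a minimal sketch of the
per-instance overrides in hbase-site.xml. The port values are examples only,
and each extra RegionServer instance would need its own conf directory
containing overrides like these so it does not collide with the first
instance.]

    <!-- Illustrative hbase-site.xml overrides for a second RegionServer
         instance on the same host; port values are examples only. -->
    <property>
      <name>hbase.regionserver.port</name>
      <value>60120</value>
    </property>
    <property>
      <name>hbase.regionserver.info.port</name>
      <value>60130</value>
    </property>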