hbase-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: 0.92 and Read/writes not scaling
Date Sat, 14 Apr 2012 03:02:43 GMT
To close the loop on this thread, we were able to track down the
issue. See https://issues.apache.org/jira/browse/HDFS-3280 - just
committed it in HDFS.

It's a simple patch if you want to patch your own build. Otherwise
this should show up in CDH4 nightly builds tonight, and I think in
CDH4b2 as well.

If you want to patch on the HBase side, you can edit HLog.java to
remove the checks for the "sync" method, and have it only call
"hflush". It's only the compatibility path that caused the problem.
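The compatibility path Todd describes boils down to looking up the legacy "sync" method reflectively on every append instead of calling hflush() directly. A minimal, self-contained sketch of that pattern (the class and field names here are illustrative stand-ins, not the actual HDFS or HLog API):

```java
import java.lang.reflect.Method;

// Stand-in for the HDFS output stream. hflush() is the modern flush call;
// sync() is the legacy alias older Hadoop versions exposed. These names
// mirror the real methods, but the class itself is hypothetical.
class FakeOutputStream {
    int flushes = 0;
    public void sync() { hflush(); }      // legacy compatibility alias
    public void hflush() { flushes++; }   // the call the patched HLog uses directly
}

public class WalSyncSketch {
    public static void main(String[] args) throws Exception {
        FakeOutputStream out = new FakeOutputStream();

        // Compatibility path (roughly what unpatched HLog did): find "sync"
        // via reflection and invoke it.
        Method m = out.getClass().getMethod("sync");
        m.invoke(out);

        // Patched path: skip the reflective lookup and call hflush() directly.
        out.hflush();

        System.out.println(out.flushes); // both paths flushed once each
    }
}
```

Either path flushes the stream; the point of the HBase-side workaround is simply to take the direct hflush() branch and avoid the compatibility code that triggered the bug fixed in HDFS-3280.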


On Wed, Apr 4, 2012 at 8:02 PM, Juhani Connolly
<juhani_connolly@cyberagent.co.jp> wrote:
> done, thanks for pointing me to that
>
> On 04/05/2012 11:43 AM, Ted Yu wrote:
>> Juhani:
>> Thanks for sharing your results.
>> Do you mind putting the summary on HBASE-5699 (Run with > 1 WAL in
>> HRegionServer)?
>>
>> On Wed, Apr 4, 2012 at 6:45 PM, Juhani Connolly
>> <juhani_connolly@cyberagent.co.jp> wrote:
>>> Another quick update: since moving back to HDFS 0.20.2 (with HBase
>>> still at 0.92), we found that while we made significant gains in
>>> throughput, most of our regionservers' IPC threads were stuck
>>> somewhere in HLog.append (out of 50, 42 were in append, of which 20
>>> were in sync), limiting throughput despite significant free hardware
>>> resources.
>>> Because the WAL writes of a single RS all go sequentially to one HDFS
>>> file, we assumed we could improve throughput by spreading writes over
>>> more WAL files and more disks. To do this we ran multiple region
>>> servers on each node.
>>> The scaling wasn't linear (we were in no way adding hardware, just
>>> increasing the number of regionservers), but we are now getting
>>> significantly more throughput.
>>> I would personally not call this a great approach; it would generally
>>> be better to build more, smaller servers, which would then not limit
>>> themselves by pushing a lot of data per server through a single WAL
>>> file.
>>> Of course there may be another solution to this that I'm not aware
>>> of? If so I'd love to hear it.
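The single-WAL bottleneck Juhani and Ted discuss is the motivation for HBASE-5699 (more than one WAL per regionserver). The core idea, sketched below with hypothetical names (WalWriter, pickWal are not HBase's actual API), is to partition appends across several WAL writers while keeping each region pinned to one writer so its edits stay ordered for replay:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical WAL writer: counts appends instead of writing to HDFS.
class WalWriter {
    final int id;
    int appends = 0;
    WalWriter(int id) { this.id = id; }
    void append(String edit) { appends++; }
}

public class MultiWalSketch {
    // Pin each region to one WAL by hashing its name, so a single region's
    // edits are never interleaved across files.
    static WalWriter pickWal(List<WalWriter> wals, String regionName) {
        int idx = Math.floorMod(regionName.hashCode(), wals.size());
        return wals.get(idx);
    }

    public static void main(String[] args) {
        List<WalWriter> wals = new ArrayList<>();
        for (int i = 0; i < 4; i++) wals.add(new WalWriter(i));

        String[] regions = {"r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8"};
        for (String r : regions) pickWal(wals, r).append("edit");

        for (WalWriter w : wals)
            System.out.println("wal-" + w.id + ": " + w.appends + " appends");
    }
}
```

Running several regionservers per node, as Juhani did, gets a similar effect (more WAL files, more disks in play) at the cost of extra per-process overhead; multiple WALs inside one regionserver avoids that duplication.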

Todd Lipcon
Software Engineer, Cloudera
