hbase-user mailing list archives

From tsuna <tsuna...@gmail.com>
Subject Re: 0.90 latency performance, cdh3b4
Date Fri, 22 Apr 2011 06:11:57 GMT
On Thu, Apr 21, 2011 at 10:49 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
> What doesn't seem so fast is RPC. As I reported before, I was getting
> 25ms TTLB under the circumstances. In this case all the traffic to the
> node goes through the same client (but in reality, of course, the node's
> portion per client should be much less). All that traffic is using a
> single regionserver node RPC queue, as HConnection would not open more
> than one socket to the same region. And TCP doesn't seem to perform very
> well for some reason in this scenario.

I doubt that TCP is the culprit here.  If you really believe it is, can
you provide a packet capture collected with:
sudo tcpdump -nvi eth0 -s0 -w /tmp/pcap port 60020

> So, it seems to help to actually open multiple HBase connections and
> round-robin them between scans. That way, even though we waste more
> ZooKeeper connections, we also have more than one RPC channel open for
> the high-traffic region as well. A little coding and it brings us down
> from 25ms to 18ms average at 500 QPS per region and 3 pooled HBase
> connections. Perhaps normally it is not as much of a problem, as traffic
> is more uniformly distributed among regions from the same client.
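
For readers following along, the round-robin pooling described in the
quoted text could look roughly like the sketch below.  This is only an
illustration under assumptions: RoundRobinPool is a hypothetical helper,
not part of the HBase client API, and the pooled items are assumed to be
whatever objects each hold their own HBase connection (how those are
created is not shown and is not taken from the original message).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper: hands out pooled items in round-robin order.
 * T is assumed to be something that owns its own connection.
 */
final class RoundRobinPool<T> {
    private final List<T> items;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinPool(List<T> items) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("pool must not be empty");
        }
        // Defensive copy so later changes to the caller's list do not affect the pool.
        this.items = new ArrayList<T>(items);
    }

    /** Returns the next item; safe to call from several threads. */
    T next() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), items.size());
        return items.get(index);
    }
}
```

Each scan would call next() once and issue its requests through the
returned item, so consecutive scans are spread across the pooled
connections instead of sharing a single socket.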

Would you be open to trying asynchbase in your test case /
application?  I haven't seen a case (yet) where you actually *need* to
open multiple connections per RegionServer.  I expect that if the
problem is an inefficiency in the HBase client, asynchbase might do
better, and if it does not, its DEBUG logging level might shed some
light on where the problem comes from.

https://github.com/stumbleupon/asynchbase

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com
