hbase-dev mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: HBase read performance and HBase client
Date Tue, 30 Jul 2013 21:01:39 GMT
1. SCR are enabled
2. A single Configuration for all tables did not work well, but I will try it
again.
3. With Nagle I had 0.8ms avg, without it 0.4ms - I see the difference.
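
A minimal sketch of the client-side setup being retried here, assuming the
0.94-era client API (the table name and the pool-size value are made up for
illustration):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class ReadLatencyProbe {
        public static void main(String[] args) throws Exception {
            // One Configuration instance, shared by all HTables in the client.
            Configuration conf = HBaseConfiguration.create();
            // Disable Nagle's algorithm on the client side; ipc.server.tcpnodelay
            // still has to be set to true in the region server's hbase-site.xml.
            conf.setBoolean("hbase.ipc.client.tcpnodelay", true);
            // Allow more sockets per region server for the shared Configuration.
            conf.setInt("hbase.client.ipc.pool.size", 10);

            HTable table = new HTable(conf, "test_table"); // hypothetical table
            try {
                // ... run the single-get / multi-get workload here ...
            } finally {
                table.close();
            }
        }
    }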


On Tue, Jul 30, 2013 at 1:50 PM, lars hofhansl <larsh@apache.org> wrote:

> With Nagle's you'd see something around 40ms. You are not saying 0.8ms RTT
> is bad, right? Are you seeing ~40ms latencies?
>
> This thread has gotten confusing.
>
> I would try these:
> * one Configuration for all tables. Or even use a single
> HConnection/Threadpool and use the HTable(byte[], HConnection,
> ExecutorService) constructor
> * disable Nagle's: set both ipc.server.tcpnodelay and
> hbase.ipc.client.tcpnodelay to true in hbase-site.xml (both client *and*
> server)
> * increase hbase.client.ipc.pool.size in client's hbase-site.xml
> * enable short circuit reads (details depend on exact version of Hadoop).
> Google will help :)
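>
> A minimal sketch of the first two items, assuming a 0.94-era client and that
> HConnectionManager.createConnection is available in this version (otherwise
> HConnectionManager.getConnection works the same way for a shared
> Configuration); the pool size and table name are made up:
>
>     Configuration conf = HBaseConfiguration.create();
>     conf.setBoolean("hbase.ipc.client.tcpnodelay", true);
>
>     // One connection and one thread pool shared by every HTable that the
>     // client's worker threads create.
>     HConnection connection = HConnectionManager.createConnection(conf);
>     ExecutorService pool = Executors.newFixedThreadPool(60);
>
>     HTable table = new HTable(Bytes.toBytes("test_table"), connection, pool);
>     // ... run the reads ...
>     table.close();
>     pool.shutdown();
>     connection.close();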
>
> -- Lars
>
>
> ----- Original Message -----
> From: Vladimir Rodionov <vladrodionov@gmail.com>
> To: dev@hbase.apache.org
> Cc:
> Sent: Tuesday, July 30, 2013 1:30 PM
> Subject: Re: HBase read performance and HBase client
>
> Does this hbase.ipc.client.tcpnodelay (default: false) explain the poor
> single-thread performance and high latency (0.8ms on a local network)?
>
>
> On Tue, Jul 30, 2013 at 1:22 PM, Vladimir Rodionov
> <vladrodionov@gmail.com> wrote:
>
> > One more observation: one Configuration instance per HTable gives a 50%
> > boost compared to a single Configuration object for all HTables - from
> > 20K to 30K.
> >
> >
> > On Tue, Jul 30, 2013 at 1:17 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
> >
> >> This thread dump was taken when the client was sending 60 requests in
> >> parallel (at least in theory). There are 50 server handler threads.
> >>
> >>
> >> On Tue, Jul 30, 2013 at 1:15 PM, Vladimir Rodionov <
> >> vladrodionov@gmail.com> wrote:
> >>
> >>> Sure, here it is:
> >>>
> >>> http://pastebin.com/8TjyrKRT
> >>>
> >>> Is epoll used not only to read/write HDFS but also to connect/listen to
> >>> clients?
> >>>
> >>>
> >>> On Tue, Jul 30, 2013 at 12:31 PM, Jean-Daniel Cryans <
> >>> jdcryans@apache.org> wrote:
> >>>
> >>>> Can you show us what the thread dump looks like when the threads are
> >>>> BLOCKED? There aren't that many locks on the read path when reading
> >>>> out of the block cache, and epoll would only happen if you need to hit
> >>>> HDFS, which you're saying is not happening.
> >>>>
> >>>> J-D
> >>>>
> >>>> On Tue, Jul 30, 2013 at 12:16 PM, Vladimir Rodionov
> >>>> <vladrodionov@gmail.com> wrote:
> >>>> > I am hitting data in the block cache, of course. The data set is very
> >>>> > small and fits comfortably into the block cache, and all requests are
> >>>> > directed to the same Region to guarantee single-RS testing.
> >>>> >
> >>>> > To Ted:
> >>>> >
> >>>> > Yes, it's CDH 4.3. What's the difference between 94.10 and 94.6 with
> >>>> > respect to read performance?
> >>>> >
> >>>> >
> >>>> > On Tue, Jul 30, 2013 at 12:06 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> >>>> >
> >>>> >> That's a tough one.
> >>>> >>
> >>>> >> One thing that comes to mind is socket reuse. It used to come up more
> >>>> >> often, but this is an issue that people hit when doing loads of random
> >>>> >> reads. Try enabling tcp_tw_recycle, but I'm not guaranteeing anything :)
> >>>> >>
> >>>> >> Also if you _just_ want to saturate something, be it CPU or network,
> >>>> >> wouldn't it be better to hit data only in the block cache? This way it
> >>>> >> has the lowest overhead?
> >>>> >>
> >>>> >> Last thing I wanted to mention is that yes, the client doesn't scale
> >>>> >> very well. I would suggest you give the asynchbase client a run.
> >>>> >>
> >>>> >> J-D
> >>>> >>
> >>>> >> On Tue, Jul 30, 2013 at 11:23 AM, Vladimir Rodionov
> >>>> >> <vrodionov@carrieriq.com> wrote:
> >>>> >> > I have been doing quite extensive testing of different read scenarios:
> >>>> >> >
> >>>> >> > 1. blockcache disabled/enabled
> >>>> >> > 2. data is local/remote (no good hdfs locality)
> >>>> >> >
> >>>> >> > and it turned out that I cannot saturate 1 RS using one (comparable
> >>>> >> > in CPU power and RAM) client host:
> >>>> >> >
> >>>> >> > I am running a client app with 60 read threads active (with multi-get)
> >>>> >> > that is going to one particular RS, and this RS's load is 100-150% (out
> >>>> >> > of 3200% available), which means the load is ~5%.
> >>>> >> >
> >>>> >> > All threads in RS are either in BLOCKED (wait) or in IN_NATIVE states
> >>>> >> > (epoll).
> >>>> >> >
> >>>> >> > I attribute this to the HBase client implementation, which seems to be
> >>>> >> > not scalable (I am going to dig into the client later today).
> >>>> >> >
> >>>> >> > Some numbers: the maximum I could get from single gets (60 threads) is
> >>>> >> > 30K per sec. Multi-get gives ~75K (60 threads).
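> >>>> >> >
> >>>> >> > For reference, the multi-get path here is just a batched
> >>>> >> > HTable.get(List<Get>) call; a minimal sketch with made-up row keys,
> >>>> >> > assuming an existing HTable instance named "table":
> >>>> >> >
> >>>> >> >     List<Get> gets = new ArrayList<Get>();
> >>>> >> >     for (int i = 0; i < 100; i++) {
> >>>> >> >         gets.add(new Get(Bytes.toBytes("row-" + i)));
> >>>> >> >     }
> >>>> >> >     // One batched call; results come back in the same order as the Gets.
> >>>> >> >     Result[] results = table.get(gets);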
> >>>> >> >
> >>>> >> > What are my options? I want to measure the limits, and I do not want
> >>>> >> > to run a cluster of clients against just ONE Region Server.
> >>>> >> >
> >>>> >> > RS config: 96GB RAM, 16 (32) CPU
> >>>> >> > Client:    48GB RAM,  8 (16) CPU
> >>>> >> >
> >>>> >> > Best regards,
> >>>> >> > Vladimir Rodionov
> >>>> >> > Principal Platform Engineer
> >>>> >> > Carrier IQ, www.carrieriq.com
> >>>> >> > e-mail: vrodionov@carrieriq.com
> >>>> >> >
> >>>> >>
> >>>>
> >>>
> >>>
> >>
> >
>
>
