hbase-dev mailing list archives

From: Vladimir Rodionov <vladrodio...@gmail.com>
Subject: Re: HBase read performance and HBase client
Date: Tue, 30 Jul 2013 20:58:38 GMT
With:

hbase.ipc.client.tcpnodelay = true
hbase.client.ipc.pool.size = 5

I was able to achieve 50K ops per sec for single-get operations. No
improvement for multi-gets.
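For reference, a minimal sketch of how those two settings can be applied from
client code instead of hbase-site.xml; the table name "t1" and the row key are
placeholders, and the 0.94-era HTable API is assumed:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class TunedClient {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    // Disable Nagle's algorithm on client RPC sockets.
    conf.setBoolean("hbase.ipc.client.tcpnodelay", true);
    // Keep a small pool of RPC connections per region server
    // instead of one shared socket.
    conf.setInt("hbase.client.ipc.pool.size", 5);

    HTable table = new HTable(conf, "t1");   // "t1" is a placeholder table name
    Result r = table.get(new Get(Bytes.toBytes("row-00001")));
    System.out.println("cells returned: " + r.size());
    table.close();
  }
}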


On Tue, Jul 30, 2013 at 1:52 PM, Vladimir Rodionov
<vladrodionov@gmail.com> wrote:

> Exactly, but this thread dump is nevertheless from the RS under load (you
> can see that one thread is in JAVA, reading data from a socket)
>
>
> On Tue, Jul 30, 2013 at 1:35 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>
>> FWIW nothing is happening in that thread dump.
>>
>> J-D
>>
>> On Tue, Jul 30, 2013 at 1:15 PM, Vladimir Rodionov
>> <vladrodionov@gmail.com> wrote:
>> > Sure, here it is:
>> >
>> > http://pastebin.com/8TjyrKRT
>> >
>> > Isn't epoll used not only to read/write HDFS but also to listen for and
>> > accept client connections?
>> >
>> >
>> > On Tue, Jul 30, 2013 at 12:31 PM, Jean-Daniel Cryans
>> > <jdcryans@apache.org> wrote:
>> >
>> >> Can you show us what the thread dump looks like when the threads are
>> >> BLOCKED? There aren't that many locks on the read path when reading
>> >> out of the block cache, and epoll would only happen if you need to hit
>> >> HDFS, which you're saying is not happening.
>> >>
>> >> J-D
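If full jstack output is too noisy, the BLOCKED threads can also be filtered
programmatically with the standard ThreadMXBean API. A minimal sketch follows;
note it only sees the JVM it runs in, so for a remote RS you would still attach
with jstack or a JMX connection:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class BlockedThreadDump {
  public static void main(String[] args) {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    // Dump all threads, including lock and monitor information.
    for (ThreadInfo ti : mx.dumpAllThreads(true, true)) {
      if (ti.getThreadState() == Thread.State.BLOCKED) {
        // Print the thread name, the lock it is waiting on, and the owner.
        System.out.println(ti.getThreadName() + " blocked on " + ti.getLockName()
            + " held by " + ti.getLockOwnerName());
      }
    }
  }
}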
>> >>
>> >> On Tue, Jul 30, 2013 at 12:16 PM, Vladimir Rodionov
>> >> <vladrodionov@gmail.com> wrote:
>> >> > I am hitting data in the block cache, of course. The data set is
>> >> > small enough to fit comfortably into the block cache, and all requests
>> >> > are directed to the same Region to guarantee single-RS testing.
>> >> >
>> >> > To Ted:
>> >> >
>> >> > Yes, it's CDH 4.3. What is the difference between 94.10 and 94.6
>> >> > with respect to read performance?
>> >> >
>> >> >
>> >> > On Tue, Jul 30, 2013 at 12:06 PM, Jean-Daniel Cryans
>> >> > <jdcryans@apache.org> wrote:
>> >> >
>> >> >> That's a tough one.
>> >> >>
>> >> >> One thing that comes to mind is socket reuse. It used to come up
>> >> >> more often, but this is an issue that people hit when doing loads of
>> >> >> random reads. Try enabling tcp_tw_recycle, but I'm not guaranteeing
>> >> >> anything :)
>> >> >>
>> >> >> Also, if you _just_ want to saturate something, be it CPU or network,
>> >> >> wouldn't it be better to hit data only in the block cache? That way
>> >> >> it has the lowest overhead.
>> >> >>
>> >> >> Last thing I wanted to mention is that, yes, the client doesn't scale
>> >> >> very well. I would suggest you give the asynchbase client a run.
>> >> >>
>> >> >> J-D
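For comparison, a rough sketch of the same single get issued through the
suggested asynchbase client; the ZooKeeper quorum address, table and row key
are placeholders, and join() is used only to keep the example synchronous (the
point of the Deferred-based API is that one client instance multiplexes many
outstanding requests without a thread per call):

import java.util.ArrayList;
import org.hbase.async.GetRequest;
import org.hbase.async.HBaseClient;
import org.hbase.async.KeyValue;

public class AsyncGetSketch {
  public static void main(String[] args) throws Exception {
    // One client per process; it multiplexes all RPCs internally.
    HBaseClient client = new HBaseClient("zk-host:2181");  // placeholder quorum
    GetRequest get = new GetRequest("t1", "row-00001");    // placeholder table/row
    // get() returns a Deferred; join() blocks here only for the example's sake.
    ArrayList<KeyValue> row = client.get(get).join();
    System.out.println("cells returned: " + row.size());
    client.shutdown().join();
  }
}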
>> >> >>
>> >> >> On Tue, Jul 30, 2013 at 11:23 AM, Vladimir Rodionov
>> >> >> <vrodionov@carrieriq.com> wrote:
>> >> >> > I have been doing quite extensive testing of different read
>> scenarios:
>> >> >> >
>> >> >> > 1. blockcache disabled/enabled
>> >> >> > 2. data is local/remote (no good hdfs locality)
>> >> >> >
>> >> >> > and it turned out that I cannot saturate one RS using one
>> >> >> > (comparable in CPU power and RAM) client host:
>> >> >> >
>> >> >> > I am running a client app with 60 read threads active (using
>> >> >> > multi-get) against one particular RS, and this RS's load is
>> >> >> > 100-150% (out of 3200% available) - which means the load is ~5%.
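For context, a stripped-down sketch of what each of those reader threads might
do with the 0.94 client is below; the batch size, table and key layout are made
up, and Get.setCacheBlocks() is shown as one way to toggle block caching per
request for the blockcache enabled/disabled scenarios listed above (the
table-level BLOCKCACHE family attribute is the other):

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

// Runnable executed by each of the 60 reader threads (sketch only).
class MultiGetWorker implements Runnable {
  private final HTable table;          // HTable is not thread-safe: one per thread
  private final int batchSize = 100;   // made-up batch size

  MultiGetWorker(HTable table) { this.table = table; }

  public void run() {
    try {
      while (!Thread.currentThread().isInterrupted()) {
        List<Get> batch = new ArrayList<Get>(batchSize);
        for (int i = 0; i < batchSize; i++) {
          Get g = new Get(Bytes.toBytes("row-" + (long) (Math.random() * 1000000)));
          g.setCacheBlocks(true);      // false = do not populate the block cache for this read
          batch.add(g);
        }
        // One multi-get: the client groups the Gets by region server, one RPC each.
        Result[] results = table.get(batch);
        if (results.length == 0) break;
      }
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}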
>> >> >> >
>> >> >> > All threads in the RS are either in the BLOCKED (wait) or
>> >> >> > IN_NATIVE (epoll) state.
>> >> >> >
>> >> >> > I attribute this to the HBase client implementation, which seems
>> >> >> > not to be scalable (I am going to dig into the client later today).
>> >> >> >
>> >> >> > Some numbers: the maximum I could get from single gets (60
>> >> >> > threads) is 30K per sec. Multi-get gives ~75K (60 threads).
>> >> >> >
>> >> >> > What are my options? I want to measure the limits, and I do not
>> >> >> > want to run a cluster of clients against just ONE Region Server.
>> >> >> >
>> >> >> > RS config: 96 GB RAM, 16 (32) CPU cores
>> >> >> > Client:    48 GB RAM,  8 (16) CPU cores
>> >> >> >
>> >> >> > Best regards,
>> >> >> > Vladimir Rodionov
>> >> >> > Principal Platform Engineer
>> >> >> > Carrier IQ, www.carrieriq.com
>> >> >> > e-mail: vrodionov@carrieriq.com
>> >> >> >
>> >> >> >
>> >> >>
>> >>
>>
>
>
