hbase-dev mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: HBase read performance and HBase client
Date Thu, 01 Aug 2013 20:25:09 GMT
Michael, the network is not a bottleneck, since the raw KV size is 62 bytes. 1GbE
can pump > 1M of these objects per second.

The block cache is enabled, its size is ~2GB, the query data set is less than 1MB,
and the block cache hit rate is 99% (I think it's 99.99% in reality).
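
A quick back-of-the-envelope check of that bandwidth claim, as a small standalone
Java snippet. The ~125 MB/s figure is an assumption (usable 1GbE payload), and
RPC/framing overhead is ignored:

    // Rough wire-capacity check for 62-byte raw KVs on 1GbE.
    // Assumption: ~125 MB/s usable bandwidth; RPC/framing overhead ignored.
    public class WireBoundCheck {
      public static void main(String[] args) {
        long linkBytesPerSec = 125L * 1000 * 1000; // ~1 Gbit/s of payload
        int rawKvSize = 62;                        // bytes, from the test config
        long kvsPerSec = linkBytesPerSec / rawKvSize;
        // Prints ~2,016,129 - far above the observed 50K-200K ops/sec,
        // so the wire itself is not the limiting factor.
        System.out.println(kvsPerSec + " KVs/sec upper bound");
      }
    }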


On Thu, Aug 1, 2013 at 12:10 PM, Michael Segel <msegel_hadoop@hotmail.com> wrote:

> Ok... Bonded 1GbE is less than 2GbE, not sure of actual max throughput.
>
> Are you hitting data in cache or are you fetching data from disk?
> I mean can we rule out disk I/O because the data would most likely be in
> cache?
>
> Are you monitoring your cluster with Ganglia? What do you see in terms of
> network traffic?
> Are all of the nodes in the test cluster on the same switch? Including the
> client?
>
>
> (Sorry, I'm currently looking at a network problem so now everything I see
> may be a networking problem. And a guy from Arista found me after our
> meetup last night so I am thinking about the impact on networking in the
> ecosystem. :-).  )
>
>
> -Just some guy out in left field...
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Aug 1, 2013, at 1:11 PM, "Vladimir Rodionov" <vladrodionov@gmail.com>
> wrote:
>
> > 2x1Gb bonded, I think. This is our standard config.
> >
> >
> > On Thu, Aug 1, 2013 at 10:27 AM, Michael Segel <msegel_hadoop@hotmail.com>
> > wrote:
> >
> >> Network? 1GbE or 10GbE?
> >>
> >> Sent from a remote device. Please excuse any typos...
> >>
> >> Mike Segel
> >>
> >> On Jul 31, 2013, at 9:27 PM, "Vladimir Rodionov" <vladrodionov@gmail.com>
> >> wrote:
> >>
> >>> Some final numbers:
> >>>
> >>> Test config:
> >>>
> >>> HBase 0.94.6
> >>> blockcache=true, block size = 64K, KV size = 62 bytes (raw).
> >>>
> >>> 5 Clients: 96GB, 16(32) CPUs (2.2GHz), CentOS 5.7
> >>> 1 RS Server: the same config.
> >>>
> >>> Local network with ping between hosts: 0.1 ms
> >>>
> >>>
> >>> 1. The HBase client hits the wall at ~50K ops per sec regardless of # of
> >>> CPUs, threads, IO pool size, and other settings.
> >>> 2. The HBase server was able to sustain 170K ops per sec (with 64K block
> >>> size), all from the block cache. KV size = 62 bytes (very small). This is
> >>> for a single Get op, 60 threads per client, 5 clients (on different hosts).
> >>> 3. Multi-get hits the wall at the same 170K-200K per sec. Batch sizes
> >>> tested: 30, 100. Absolutely the same performance as with batch size = 1.
> >>> Multi-get has some internal issues on the RegionServer side, maybe
> >>> excessive locking or something else (a multi-get sketch follows below).
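
For reference, a minimal sketch of the multi-get call pattern being benchmarked
above, against the HBase 0.94 client API. The table name, row keys, and the
batch size of 30 are placeholders, not the actual test harness:

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MultiGetSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "test_table"); // placeholder table name
        try {
          // One batch of 30 Gets (one of the batch sizes tested above).
          List<Get> batch = new ArrayList<Get>();
          for (int i = 0; i < 30; i++) {
            batch.add(new Get(Bytes.toBytes("row-" + i))); // placeholder keys
          }
          // The whole batch is shipped in a single client call.
          Result[] results = table.get(batch);
          System.out.println("rows returned: " + results.length);
        } finally {
          table.close();
        }
      }
    }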
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, Jul 30, 2013 at 2:01 PM, Vladimir Rodionov
> >>> <vladrodionov@gmail.com> wrote:
> >>>
> >>>> 1. SCR is enabled
> >>>> 2. A single Configuration for all tables did not work well, but I will
> >>>> try it again
> >>>> 3. With Nagle's I had 0.8ms avg, without it 0.4ms - I see the difference
> >>>>
> >>>>
> >>>> On Tue, Jul 30, 2013 at 1:50 PM, lars hofhansl <larsh@apache.org> wrote:
> >>>>
> >>>>> With Nagle's you'd see something around 40ms. You are not saying
> >>>>> 0.8ms RTT is bad, right? Are you seeing ~40ms latencies?
> >>>>>
> >>>>> This thread has gotten confusing.
> >>>>>
> >>>>> I would try these (a code sketch follows after this list):
> >>>>> * one Configuration for all tables. Or even use a single
> >>>>> HConnection/Threadpool and use the HTable(byte[], HConnection,
> >>>>> ExecutorService) constructor
> >>>>> * disable Nagle's: set both ipc.server.tcpnodelay and
> >>>>> hbase.ipc.client.tcpnodelay to true in hbase-site.xml (both client
> >>>>> *and* server)
> >>>>> * increase hbase.client.ipc.pool.size in the client's hbase-site.xml
> >>>>> * enable short-circuit reads (details depend on the exact version of
> >>>>> Hadoop). Google will help :)
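
A minimal sketch of those suggestions using the 0.94 client API. The table
name, row key, and pool sizes below are placeholders, and the same tcpnodelay
flags also need to be set in the server's hbase-site.xml:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HConnection;
    import org.apache.hadoop.hbase.client.HConnectionManager;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SharedConnectionSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Disable Nagle's on the client (the server needs the same flags).
        conf.setBoolean("hbase.ipc.client.tcpnodelay", true);
        conf.setBoolean("ipc.server.tcpnodelay", true);
        // More sockets per region server from this client process.
        conf.setInt("hbase.client.ipc.pool.size", 10); // placeholder value

        // One shared connection and one thread pool for all HTable instances.
        HConnection connection = HConnectionManager.getConnection(conf);
        ExecutorService pool = Executors.newFixedThreadPool(60);

        HTable table = new HTable(Bytes.toBytes("test_table"), connection, pool);
        try {
          Result r = table.get(new Get(Bytes.toBytes("row-0")));
          System.out.println("cells: " + r.size());
        } finally {
          table.close();
          pool.shutdown();
        }
      }
    }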
> >>>>>
> >>>>> -- Lars
> >>>>>
> >>>>>
> >>>>> ----- Original Message -----
> >>>>> From: Vladimir Rodionov <vladrodionov@gmail.com>
> >>>>> To: dev@hbase.apache.org
> >>>>> Cc:
> >>>>> Sent: Tuesday, July 30, 2013 1:30 PM
> >>>>> Subject: Re: HBase read performance and HBase client
> >>>>>
> >>>>> So this hbase.ipc.client.tcpnodelay (default: false) explains the poor
> >>>>> single-thread performance and high latency (0.8ms on a local network)?
> >>>>>
> >>>>>
> >>>>> On Tue, Jul 30, 2013 at 1:22 PM, Vladimir Rodionov
> >>>>> <vladrodionov@gmail.com> wrote:
> >>>>>
> >>>>>> One more observation: one Configuration instance per HTable gives a
> >>>>>> 50% boost compared to a single Configuration object for all HTables -
> >>>>>> from 20K to 30K.
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Jul 30, 2013 at 1:17 PM, Vladimir Rodionov <
> >>>>>> vladrodionov@gmail.com> wrote:
> >>>>>>
> >>>>>>> This thread dump was taken while the client was sending 60 requests
> >>>>>>> in parallel (at least, in theory). There are 50 server handler
> >>>>>>> threads.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Jul 30, 2013 at 1:15 PM, Vladimir Rodionov <
> >>>>>>> vladrodionov@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Sure, here it is:
> >>>>>>>>
> >>>>>>>> http://pastebin.com/8TjyrKRT
> >>>>>>>>
> >>>>>>>> Epoll is not only for reading/writing HDFS but for connecting/listening
> >>>>>>>> to clients as well?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Tue, Jul 30, 2013 at 12:31 PM, Jean-Daniel Cryans <
> >>>>>>>> jdcryans@apache.org> wrote:
> >>>>>>>>
> >>>>>>>>> Can you show us what the thread dump looks like when the threads
> >>>>>>>>> are BLOCKED? There aren't that many locks on the read path when
> >>>>>>>>> reading out of the block cache, and epoll would only happen if
> >>>>>>>>> you need to hit HDFS, which you're saying is not happening.
> >>>>>>>>>
> >>>>>>>>> J-D
> >>>>>>>>>
> >>>>>>>>> On Tue, Jul 30, 2013 at 12:16 PM, Vladimir Rodionov
> >>>>>>>>> <vladrodionov@gmail.com> wrote:
> >>>>>>>>>> I am hitting data in the block cache, of course. The data set is
> >>>>>>>>>> very small and fits comfortably into the block cache, and all
> >>>>>>>>>> requests are directed to the same Region to guarantee single-RS
> >>>>>>>>>> testing.
> >>>>>>>>>>
> >>>>>>>>>> To Ted:
> >>>>>>>>>>
> >>>>>>>>>> Yes, it's CDH 4.3. What is the difference between 94.10 and 94.6
> >>>>>>>>>> with respect to read performance?
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Jul 30, 2013 at 12:06 PM, Jean-Daniel Cryans <
> >>>>>>>>>> jdcryans@apache.org> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> That's a tough one.
> >>>>>>>>>>>
> >>>>>>>>>>> One thing that comes to mind is socket reuse. It used to come
> >>>>>>>>>>> up more often, but this is an issue that people hit when doing
> >>>>>>>>>>> loads of random reads. Try enabling tcp_tw_recycle, but I'm not
> >>>>>>>>>>> guaranteeing anything :)
> >>>>>>>>>>>
> >>>>>>>>>>> Also, if you _just_ want to saturate something, be it CPU or
> >>>>>>>>>>> network, wouldn't it be better to hit data only in the block
> >>>>>>>>>>> cache? This way it has the lowest overhead?
> >>>>>>>>>>>
> >>>>>>>>>>> Last thing I wanted to mention is that yes, the client doesn't
> >>>>>>>>>>> scale very well. I would suggest you give the asynchbase client
> >>>>>>>>>>> a run.
> >>>>>>>>>>>
> >>>>>>>>>>> J-D
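
A hedged sketch of trying the asynchbase client mentioned above. The class and
method names are from memory of the asynchbase 1.x API, and the ZooKeeper
quorum, table name, and row key are placeholders:

    import java.util.ArrayList;

    import org.hbase.async.GetRequest;
    import org.hbase.async.HBaseClient;
    import org.hbase.async.KeyValue;

    public class AsynchbaseSketch {
      public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper quorum spec.
        HBaseClient client = new HBaseClient("zkhost:2181");
        try {
          GetRequest get = new GetRequest("test_table", "row-0"); // placeholders
          // Block on the Deferred here for simplicity; a real benchmark would
          // chain callbacks to keep many Gets in flight per thread.
          ArrayList<KeyValue> cells = client.get(get).joinUninterruptibly();
          System.out.println("cells: " + cells.size());
        } finally {
          client.shutdown().joinUninterruptibly();
        }
      }
    }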
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Jul 30, 2013 at 11:23 AM, Vladimir Rodionov
> >>>>>>>>>>> <vrodionov@carrieriq.com> wrote:
> >>>>>>>>>>>> I have been doing quite extensive testing of different read
> >>>>>>>>>>>> scenarios:
> >>>>>>>>>>>>
> >>>>>>>>>>>> 1. blockcache disabled/enabled
> >>>>>>>>>>>> 2. data is local/remote (no good hdfs locality)
> >>>>>>>>>>>>
> >>>>>>>>>>>> and it turned out that I cannot saturate 1 RS using one
> >>>>>>>>>>>> (comparable in CPU power and RAM) client host:
> >>>>>>>>>>>>
> >>>>>>>>>>>> I am running a client app with 60 read threads active (with
> >>>>>>>>>>>> multi-get) that is going to one particular RS, and this RS's
> >>>>>>>>>>>> load is 100-150% (out of 3200% available) - it means that the
> >>>>>>>>>>>> load is ~5%.
> >>>>>>>>>>>>
> >>>>>>>>>>>> All threads in the RS are either in BLOCKED (wait) or IN_NATIVE
> >>>>>>>>>>>> (epoll) states.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I attribute this to the HBase client implementation, which
> >>>>>>>>>>>> seems not to be scalable (I am going to dig into the client
> >>>>>>>>>>>> later today).
> >>>>>>>>>>>>
> >>>>>>>>>>>> Some numbers: the maximum I could get from single Get (60
> >>>>>>>>>>>> threads) is 30K per sec. Multi-get gives ~75K (60 threads).
> >>>>>>>>>>>>
> >>>>>>>>>>>> What are my options? I want to measure the limits, and I do
> >>>>>>>>>>>> not want to run a cluster of clients against just ONE Region
> >>>>>>>>>>>> Server.
> >>>>>>>>>>>>
> >>>>>>>>>>>> RS config: 96GB RAM, 16(32) CPU
> >>>>>>>>>>>> Client:    48GB RAM, 8(16) CPU
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best regards,
> >>>>>>>>>>>> Vladimir Rodionov
> >>>>>>>>>>>> Principal Platform Engineer
> >>>>>>>>>>>> Carrier IQ, www.carrieriq.com
> >>>>>>>>>>>> e-mail: vrodionov@carrieriq.com
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Confidentiality Notice: The information contained in this
> >>>>>>>>>>>> message, including any attachments hereto, may be confidential
> >>>>>>>>>>>> and is intended to be read only by the individual or entity to
> >>>>>>>>>>>> whom this message is addressed. If the reader of this message
> >>>>>>>>>>>> is not the intended recipient or an agent or designee of the
> >>>>>>>>>>>> intended recipient, please note that any review, use,
> >>>>>>>>>>>> disclosure or distribution of this message or its attachments,
> >>>>>>>>>>>> in any form, is strictly prohibited. If you have received this
> >>>>>>>>>>>> message in error, please immediately notify the sender and/or
> >>>>>>>>>>>> Notifications@carrieriq.com and delete or destroy any copy of
> >>>>>>>>>>>> this message and its attachments.
>
