hbase-dev mailing list archives

From Michael Segel <msegel_had...@hotmail.com>
Subject Re: HBase read performance and HBase client
Date Thu, 01 Aug 2013 19:10:43 GMT
OK... Bonded 1GbE is less than 2GbE; I'm not sure of the actual max throughput.

Are you hitting data in the cache, or are you fetching data from disk?
I mean, can we rule out disk I/O because the data would most likely be in cache?

Are you monitoring your cluster with Ganglia? What do you see in terms of network traffic?
Are all of the nodes in the test cluster on the same switch, including the client?


(Sorry, I'm currently looking at a network problem, so right now everything I see may be a networking
problem. And a guy from Arista found me after our meetup last night, so I am thinking about
the impact of networking on the ecosystem. :-) )


-Just some guy out in left field... 

Sent from a remote device. Please excuse any typos...

Mike Segel

On Aug 1, 2013, at 1:11 PM, "Vladimir Rodionov" <vladrodionov@gmail.com> wrote:

> 2x1Gb bonded, I think. This is our standard config.
> 
> 
>> On Thu, Aug 1, 2013 at 10:27 AM, Michael Segel <msegel_hadoop@hotmail.com> wrote:
> 
>> Network? 1GbE or 10GbE?
>> 
>> Sent from a remote device. Please excuse any typos...
>> 
>> Mike Segel
>> 
>> On Jul 31, 2013, at 9:27 PM, "Vladimir Rodionov" <vladrodionov@gmail.com>
>> wrote:
>> 
>>> Some final numbers:
>>> 
>>> Test config:
>>> 
>>> HBase 0.94.6
>>> blockcache=true, block size = 64K, KV size = 62 bytes (raw).
>>> 
>>> 5 clients: 96GB RAM, 16 (32) CPUs (2.2GHz), CentOS 5.7
>>> 1 RS server: the same config.
>>> 
>>> Local network with ping between hosts: 0.1 ms
>>> 
>>> 
>>> 1. The HBase client hits a wall at ~50K ops per sec regardless of # of CPUs,
>>> threads, IO pool size, and other settings.
>>> 2. The HBase server was able to sustain 170K ops per sec (with 64K block size),
>>> all from the block cache. KV size = 62 bytes (very small). This is for single Get
>>> ops, 60 threads per client, 5 clients (on different hosts).
>>> 3. Multi-get hits a wall at the same 170K-200K ops per sec. Batch sizes tested:
>>> 30 and 100 (a sketch of the batch API follows below). Absolutely the same
>>> performance as with batch size = 1. Multi-get seems to have some internal issues
>>> on the RegionServer side, maybe excessive locking or something else.
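For context, a batched read of this kind would look roughly like the sketch below against the 0.94 client API; the table name and row keys are placeholders, not the ones used in the test.

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MultiGetSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "test_table");   // hypothetical table name
            try {
                // One multi-get of 30 rows, matching the smaller batch size tested above.
                List<Get> batch = new ArrayList<Get>(30);
                for (int i = 0; i < 30; i++) {
                    batch.add(new Get(Bytes.toBytes("row-" + i)));   // hypothetical keys
                }
                Result[] results = table.get(batch);   // one Result per Get, single round trip per region server
                System.out.println("fetched " + results.length + " rows");
            } finally {
                table.close();
            }
        }
    }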
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Jul 30, 2013 at 2:01 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
>>> 
>>>> 1. SCR is enabled.
>>>> 2. A single Configuration for all tables did not work well, but I will try it
>>>> again.
>>>> 3. With Nagle's I had 0.8ms avg, without - 0.4ms - I see the difference.
>>>> 
>>>> 
>>>> On Tue, Jul 30, 2013 at 1:50 PM, lars hofhansl <larsh@apache.org> wrote:
>>>> 
>>>>> With Nagle's you'd see something around 40ms. You are not saying 0.8ms
>>>>> RTT is bad, right? Are you seeing ~40ms latencies?
>>>>> 
>>>>> This thread has gotten confusing.
>>>>> 
>>>>> I would try these:
>>>>> * one Configuration for all tables. Or even use a single
>>>>> HConnection/thread pool and the HTable(byte[], HConnection,
>>>>> ExecutorService) constructor (see the sketch after this list)
>>>>> * disable Nagle's: set both ipc.server.tcpnodelay and
>>>>> hbase.ipc.client.tcpnodelay to true in hbase-site.xml (both client *and*
>>>>> server)
>>>>> * increase hbase.client.ipc.pool.size in the client's hbase-site.xml
>>>>> * enable short circuit reads (details depend on the exact version of Hadoop).
>>>>> Google will help :)
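A minimal sketch of the first two suggestions, assuming HConnectionManager.createConnection is available in your 0.94.x build; the table name, row key, thread-pool size, and pool.size value are illustrative only, and ipc.server.tcpnodelay still has to be set in the server's hbase-site.xml.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HConnection;
    import org.apache.hadoop.hbase.client.HConnectionManager;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SharedConnectionGets {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            // Client-side Nagle's off; ipc.server.tcpnodelay=true must still be
            // set in the RegionServer's own hbase-site.xml.
            conf.setBoolean("hbase.ipc.client.tcpnodelay", true);
            // More sockets per RegionServer; 10 is an illustrative value.
            conf.setInt("hbase.client.ipc.pool.size", 10);

            // One connection and one thread pool shared by every HTable instance.
            HConnection connection = HConnectionManager.createConnection(conf);
            ExecutorService pool = Executors.newFixedThreadPool(60);
            try {
                HTable table = new HTable(Bytes.toBytes("test_table"), connection, pool);
                try {
                    Result r = table.get(new Get(Bytes.toBytes("row-0001")));
                    System.out.println("cells: " + r.size());
                } finally {
                    table.close();
                }
            } finally {
                pool.shutdown();
                connection.close();
            }
        }
    }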
>>>>> 
>>>>> -- Lars
>>>>> 
>>>>> 
>>>>> ----- Original Message -----
>>>>> From: Vladimir Rodionov <vladrodionov@gmail.com>
>>>>> To: dev@hbase.apache.org
>>>>> Cc:
>>>>> Sent: Tuesday, July 30, 2013 1:30 PM
>>>>> Subject: Re: HBase read performance and HBase client
>>>>> 
>>>>> Does this hbase.ipc.client.tcpnodelay (default: false) explain the poor
>>>>> single-thread performance and high latency (0.8ms on a local network)?
>>>>> 
>>>>> 
>>>>> On Tue, Jul 30, 2013 at 1:22 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
>>>>> 
>>>>>> One more observation: one Configuration instance per HTable gives a 50%
>>>>>> boost compared to a single Configuration object for all HTables - from
>>>>>> 20K to 30K.
>>>>>> 
>>>>>> 
>>>>>> On Tue, Jul 30, 2013 at 1:17 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
>>>>>> 
>>>>>>> This thread dump was taken when the client was sending 60 requests in
>>>>>>> parallel (at least in theory). There are 50 server handler threads.
>>>>>>> 
>>>>>>> 
>>>>>>> On Tue, Jul 30, 2013 at 1:15 PM, Vladimir Rodionov <
>>>>>>> vladrodionov@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Sure, here it is:
>>>>>>>> 
>>>>>>>> http://pastebin.com/8TjyrKRT
>>>>>>>> 
>>>>>>>> Is epoll used not only to read/write HDFS, but to connect/listen to
>>>>>>>> clients as well?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tue, Jul 30, 2013 at 12:31 PM, Jean-Daniel Cryans <
>>>>>>>> jdcryans@apache.org> wrote:
>>>>>>>> 
>>>>>>>>> Can you show us what the thread dump looks like when the threads are
>>>>>>>>> BLOCKED? There aren't that many locks on the read path when reading
>>>>>>>>> out of the block cache, and epoll would only happen if you need to hit
>>>>>>>>> HDFS, which you're saying is not happening.
>>>>>>>>> 
>>>>>>>>> J-D
>>>>>>>>> 
>>>>>>>>> On Tue, Jul 30, 2013 at 12:16 PM, Vladimir Rodionov
>>>>>>>>> <vladrodionov@gmail.com> wrote:
>>>>>>>>>> I am hitting data in the block cache, of course. The data set is small
>>>>>>>>>> enough to fit comfortably into the block cache, and all requests are
>>>>>>>>>> directed to the same Region to guarantee single-RS testing.
>>>>>>>>>> 
>>>>>>>>>> To Ted:
>>>>>>>>>> 
>>>>>>>>>> Yes, it's CDH 4.3. What is the difference between 94.10 and 94.6 with
>>>>>>>>>> respect to read performance?
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Tue, Jul 30, 2013 at 12:06 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>>>>>>>>>> 
>>>>>>>>>>> That's a tough one.
>>>>>>>>>>> 
>>>>>>>>>>> One thing that comes to mind is socket reuse. It used to come up
>>>>>>>>>>> more often, but this is an issue that people hit when doing loads of
>>>>>>>>>>> random reads. Try enabling tcp_tw_recycle, but I'm not guaranteeing
>>>>>>>>>>> anything :)
>>>>>>>>>>> 
>>>>>>>>>>> Also, if you _just_ want to saturate something, be it CPU or network,
>>>>>>>>>>> wouldn't it be better to hit data only in the block cache? This way
>>>>>>>>>>> it has the lowest overhead?
>>>>>>>>>>> 
>>>>>>>>>>> Last thing I wanted to mention is that yes, the client doesn't scale
>>>>>>>>>>> very well. I would suggest you give the asynchbase client a run.
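If you try asynchbase, a minimal single-Get sketch would look roughly like this; the ZooKeeper quorum spec, table name, and row key are placeholders, and blocking on the Deferred is only for brevity (a benchmark would attach callbacks and keep many Gets in flight).

    import java.util.ArrayList;

    import org.hbase.async.GetRequest;
    import org.hbase.async.HBaseClient;
    import org.hbase.async.KeyValue;

    public class AsyncGetSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical ZooKeeper quorum spec.
            HBaseClient client = new HBaseClient("zkhost1,zkhost2,zkhost3");
            try {
                GetRequest get = new GetRequest("test_table", "row-0001");
                // joinUninterruptibly() blocks on the Deferred result.
                ArrayList<KeyValue> row = client.get(get).joinUninterruptibly();
                System.out.println("cells: " + row.size());
            } finally {
                client.shutdown().joinUninterruptibly();
            }
        }
    }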
>>>>>>>>>>> 
>>>>>>>>>>> J-D
>>>>>>>>>>> 
>>>>>>>>>>> On Tue, Jul 30, 2013 at 11:23 AM, Vladimir Rodionov
>>>>>>>>>>> <vrodionov@carrieriq.com> wrote:
>>>>>>>>>>>> I have been doing quite extensive testing of different read scenarios:
>>>>>>>>>>>> 
>>>>>>>>>>>> 1. blockcache disabled/enabled
>>>>>>>>>>>> 2. data is local/remote (no good hdfs locality)
>>>>>>>>>>>> 
>>>>>>>>>>>> and it turned out that I cannot saturate 1 RS using one client host
>>>>>>>>>>>> (comparable in CPU power and RAM):
>>>>>>>>>>>> 
>>>>>>>>>>>> I am running a client app with 60 active read threads (with multi-get)
>>>>>>>>>>>> going to one particular RS, and this RS's load is 100-150% (out of
>>>>>>>>>>>> 3200% available) - it means the load is ~5%.
>>>>>>>>>>>> 
>>>>>>>>>>>> All threads in the RS are either in BLOCKED (wait) or IN_NATIVE (epoll)
>>>>>>>>>>>> states.
>>>>>>>>>>>> 
>>>>>>>>>>>> I attribute this to the HBase client implementation, which seems to be
>>>>>>>>>>>> not scalable (I am going to dig into the client later on today).
>>>>>>>>>>>> 
>>>>>>>>>>>> Some numbers: the maximum I could get from single Get (60 threads) is
>>>>>>>>>>>> 30K per sec. Multi-get gives ~75K (60 threads).
>>>>>>>>>>>> 
>>>>>>>>>>>> What are my options? I want to measure the limits, and I do not want to
>>>>>>>>>>>> run a cluster of clients against just ONE Region Server.
>>>>>>>>>>>> 
>>>>>>>>>>>> RS config: 96GB RAM, 16(32) CPU
>>>>>>>>>>>> Client     : 48GB RAM   8 (16) CPU
>>>>>>>>>>>> 
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Vladimir Rodionov
>>>>>>>>>>>> Principal Platform Engineer
>>>>>>>>>>>> Carrier IQ, www.carrieriq.com
>>>>>>>>>>>> e-mail: vrodionov@carrieriq.com
>>>>>>>>>>>> 
>>>>>>>>>>>> 
