hbase-dev mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: HBase read performance and HBase client
Date Thu, 01 Aug 2013 17:11:28 GMT
Yes, I think HBASE-9087 is what I have been observing in my load tests. It
seems that Store access (Scanner creation) is not only over-synchronized
but CPU intensive as well.
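For illustration only, here is a self-contained micro-benchmark sketch (this is not HBase code: the class name, the storeLock monitor and the observer set are invented stand-ins that merely mirror the pattern quoted below) showing how 60 threads funnelling every "scanner open/close" through one shared lock serialize on that single point:

    import java.util.concurrent.CopyOnWriteArraySet;
    import java.util.concurrent.CountDownLatch;

    // Illustrative micro-benchmark only: many "handler" threads register and
    // unregister an observer under one shared lock, mimicking per-Get scanner
    // creation funnelling through a single synchronization point.
    public class ObserverContentionSketch {
      private static final Object storeLock = new Object();
      private static final CopyOnWriteArraySet<Object> observers =
          new CopyOnWriteArraySet<Object>();

      public static void main(String[] args) throws Exception {
        final int threads = 60;           // roughly the handler count used in these tests
        final int opsPerThread = 200000;
        final CountDownLatch done = new CountDownLatch(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
          new Thread(new Runnable() {
            public void run() {
              for (int i = 0; i < opsPerThread; i++) {
                Object observer = new Object();
                synchronized (storeLock) {   // "open scanner"
                  observers.add(observer);
                }
                synchronized (storeLock) {   // "close scanner"
                  observers.remove(observer);
                }
              }
              done.countDown();
            }
          }).start();
        }
        done.await();
        long ms = (System.nanoTime() - start) / 1000000L;
        System.out.println(60L * opsPerThread + " register/unregister pairs in " + ms + " ms");
      }
    }

Running it with and without the synchronized blocks gives a rough feel for how much time goes to queueing on the monitor rather than to useful work.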


On Thu, Aug 1, 2013 at 9:27 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> Vlad:
> You might want to look at HBASE-9087 Handlers being blocked during reads
>
> On Thu, Aug 1, 2013 at 9:24 AM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
>
> > All tests I have run were hitting a single region on a region server. I
> > suspect this is not the right scenario. There are some points in the Store
> > class which are heavily synchronized:
> >
> > For example this one:
> >   // All access must be synchronized.
> >   private final CopyOnWriteArraySet<ChangedReadersObserver>
> > changedReaderObservers =
> >     new CopyOnWriteArraySet<ChangedReadersObserver>();
> >
> > I will re-run tests against all available regions on a RS and will post
> > results later on today.
> >
> >
> >
> >
> > On Wed, Jul 31, 2013 at 11:15 PM, lars hofhansl <larsh@apache.org> wrote:
> >
> > > Yeah, that would seem to indicate that seeking into the block is not a
> > > bottleneck (and you said earlier that everything fits into the blockcache).
> > > Need to profile to know more. If you have time, it would be cool if you
> > > could start jvisualvm, attach it to the RS, start the profiling, and let
> > > the workload run for a bit.
> > >
> > > -- Lars
> > >
> > >
> > >
> > > ----- Original Message -----
> > > From: Vladimir Rodionov <vladrodionov@gmail.com>
> > > To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org>
> > > Cc:
> > > Sent: Wednesday, July 31, 2013 9:57 PM
> > > Subject: Re: HBase read performance and HBase client
> > >
> > > Smaller block size (32K) does not give any performance gain and this is
> > > strange, to say the least.
> > >
> > >
> > > On Wed, Jul 31, 2013 at 9:33 PM, lars hofhansl <larsh@apache.org> wrote:
> > >
> > > > Would be interesting to profile MultiGet. With RTT of 0.1ms, the
> > > > internal RS friction is probably the main contributor.
> > > > In fact MultiGet just loops over the set at the RS and calls single
> > > > gets on the various regions.
> > > >
> > > > Each Get needs to reseek into the block (even when it is cached, since
> > > > KVs have variable size).
> > > >
> > > > There are HBASE-6136 and HBASE-8362.
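As a concrete illustration of that multi-get path, here is a minimal client-side sketch (editor's illustration; the table and row names are invented) that batches Gets the way the tests quoted further down describe, assuming the 0.94-era HTable API:

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MultiGetSketch {
      public static void main(String[] args) throws Exception {
        HTable table = new HTable(HBaseConfiguration.create(), "test_table"); // invented table name
        try {
          List<Get> batch = new ArrayList<Get>();
          for (int i = 0; i < 30; i++) {                 // batch size 30, as in the tests below
            batch.add(new Get(Bytes.toBytes("row-" + i)));
          }
          // One batched request from the client's point of view; the RS then
          // loops over the individual Gets, as described above.
          Result[] results = table.get(batch);
          System.out.println("got " + results.length + " results");
        } finally {
          table.close();
        }
      }
    }

That server-side loop is why the batch size barely moves the throughput numbers reported further down in the thread.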
> > > >
> > > >
> > > > -- Lars
> > > >
> > > > ________________________________
> > > > From: Vladimir Rodionov <vladrodionov@gmail.com>
> > > > To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org>
> > > > Sent: Wednesday, July 31, 2013 7:27 PM
> > > > Subject: Re: HBase read performance and HBase client
> > > >
> > > >
> > > > Some final numbers:
> > > >
> > > > Test config:
> > > >
> > > > HBase 0.94.6
> > > > blockcache=true, block size = 64K, KV size = 62 bytes (raw).
> > > >
> > > > 5 clients: 96GB RAM, 16 (32) CPUs (2.2GHz), CentOS 5.7
> > > > 1 RS: the same config.
> > > >
> > > > Local network with ping between hosts: 0.1 ms
> > > >
> > > >
> > > > 1. The HBase client hits the wall at ~50K per sec regardless of # of CPUs,
> > > > threads, IO pool size and other settings.
> > > > 2. The HBase server was able to sustain 170K per sec (with 64K block size),
> > > > all from block cache. KV size = 62 bytes (very small). This is for single Get
> > > > ops, 60 threads per client, 5 clients (on different hosts).
> > > > 3. Multi-get hits the wall at the same 170K-200K per sec. Batch sizes
> > > > tested: 30, 100. Absolutely the same performance as with batch size = 1.
> > > > Multi-get has some internal issues on the RegionServer side, maybe excessive
> > > > locking or something else.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Jul 30, 2013 at 2:01 PM, Vladimir Rodionov
> > > > <vladrodionov@gmail.com> wrote:
> > > >
> > > > > 1. SCR are enabled
> > > > > 2. A single Configuration for all tables did not work well, but I will
> > > > > try it again
> > > > > 3. With Nagle I had 0.8ms avg, w/o - 0.4ms - I see the difference
> > > > >
> > > > >
> > > > > On Tue, Jul 30, 2013 at 1:50 PM, lars hofhansl <larsh@apache.org> wrote:
> > > > >
> > > > >> With Nagle's you'd see something around 40ms. You are not saying 0.8ms
> > > > >> RTT is bad, right? Are you seeing ~40ms latencies?
> > > > >>
> > > > >> This thread has gotten confusing.
> > > > >>
> > > > >> I would try these (a sketch follows the list):
> > > > >> * one Configuration for all tables, or even use a single
> > > > >> HConnection/Threadpool and the HTable(byte[], HConnection,
> > > > >> ExecutorService) constructor
> > > > >> * disable Nagle's: set both ipc.server.tcpnodelay and
> > > > >> hbase.ipc.client.tcpnodelay to true in hbase-site.xml (both client *and*
> > > > >> server)
> > > > >> * increase hbase.client.ipc.pool.size in the client's hbase-site.xml
> > > > >> * enable short circuit reads (details depend on the exact version of
> > > > >> Hadoop). Google will help :)
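To make the first two bullets concrete, here is a minimal client-side sketch (editor's illustration, not from the thread: the table name, row key and pool sizes are invented, and it assumes the 0.94-era HConnectionManager.createConnection API). Note that ipc.server.tcpnodelay has to be set on the server side; only the client-side knobs are shown here.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HConnection;
    import org.apache.hadoop.hbase.client.HConnectionManager;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SharedConnectionSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.setBoolean("hbase.ipc.client.tcpnodelay", true); // client-side Nagle's off
        conf.setInt("hbase.client.ipc.pool.size", 8);         // illustrative value

        // One connection and one thread pool shared by every HTable the client creates.
        ExecutorService pool = Executors.newFixedThreadPool(60);
        HConnection connection = HConnectionManager.createConnection(conf);
        try {
          HTable table = new HTable(Bytes.toBytes("test_table"), connection, pool);
          try {
            table.get(new Get(Bytes.toBytes("row-0")));
          } finally {
            table.close();
          }
        } finally {
          connection.close();
          pool.shutdown();
        }
      }
    }

Since HTable itself is not thread-safe, each worker thread would build its own HTable from the shared connection and pool rather than sharing a single instance.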
> > > > >>
> > > > >> -- Lars
> > > > >>
> > > > >>
> > > > >> ----- Original Message -----
> > > > >> From: Vladimir Rodionov <vladrodionov@gmail.com>
> > > > >> To: dev@hbase.apache.org
> > > > >> Cc:
> > > > >> Sent: Tuesday, July 30, 2013 1:30 PM
> > > > >> Subject: Re: HBase read performance and HBase client
> > > > >>
> > > > >> So this hbase.ipc.client.tcpnodelay (default: false) explains the poor
> > > > >> single-thread performance and high latency (0.8ms on a local network)?
> > > > >>
> > > > >>
> > > > >> On Tue, Jul 30, 2013 at 1:22 PM, Vladimir Rodionov
> > > > >> <vladrodionov@gmail.com> wrote:
> > > > >>
> > > > >> > One more observation: one Configuration instance per HTable gives a 50%
> > > > >> > boost as compared to a single Configuration object for all HTables -
> > > > >> > from 20K to 30K.
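For clarity, the two client setups being compared are roughly these (editor's sketch; the table name is invented, error handling is omitted, and it assumes the plain 0.94 HTable(Configuration, String) constructor):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class ConfigPerTableSketch {
      // Variant A: every HTable gets its own Configuration instance.
      static HTable perTableConfig() throws Exception {
        Configuration conf = HBaseConfiguration.create();   // fresh instance per table
        return new HTable(conf, "test_table");
      }

      // Variant B: all HTables share one Configuration instance.
      static final Configuration SHARED = HBaseConfiguration.create();

      static HTable sharedConfig() throws Exception {
        return new HTable(SHARED, "test_table");
      }
    }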
> > > > >> >
> > > > >> >
> > > > >> > On Tue, Jul 30, 2013 at 1:17 PM, Vladimir Rodionov <
> > > > >> > vladrodionov@gmail.com> wrote:
> > > > >> >
> > > > >> >> This thread dump was taken when the client was sending 60 requests in
> > > > >> >> parallel (at least, in theory). There are 50 server handler threads.
> > > > >> >>
> > > > >> >>
> > > > >> >> On Tue, Jul 30, 2013 at 1:15 PM, Vladimir Rodionov <
> > > > >> >> vladrodionov@gmail.com> wrote:
> > > > >> >>
> > > > >> >>> Sure, here it is:
> > > > >> >>>
> > > > >> >>> http://pastebin.com/8TjyrKRT
> > > > >> >>>
> > > > >> >>> epoll is used not only to read/write HDFS but to connect/listen to
> > > > >> >>> clients as well?
> > > > >> >>>
> > > > >> >>>
> > > > >> >>> On Tue, Jul 30, 2013 at 12:31 PM, Jean-Daniel Cryans <
> > > > >> >>> jdcryans@apache.org> wrote:
> > > > >> >>>
> > > > >> >>>> Can you show us what the thread dump looks like when the threads are
> > > > >> >>>> BLOCKED? There aren't that many locks on the read path when reading
> > > > >> >>>> out of the block cache, and epoll would only happen if you need to hit
> > > > >> >>>> HDFS, which you're saying is not happening.
> > > > >> >>>>
> > > > >> >>>> J-D
> > > > >> >>>>
> > > > >> >>>> On Tue, Jul 30, 2013 at 12:16 PM, Vladimir Rodionov
> > > > >> >>>> <vladrodionov@gmail.com> wrote:
> > > > >> >>>> > I am hitting data in the block cache, of course. The data set is
> > > > >> >>>> > very small and fits comfortably into the block cache, and all
> > > > >> >>>> > requests are directed to the same Region to guarantee single-RS
> > > > >> >>>> > testing.
> > > > >> >>>> >
> > > > >> >>>> > To Ted:
> > > > >> >>>> >
> > > > >> >>>> > Yes, it's CDH 4.3. What is the difference between 94.10 and 94.6
> > > > >> >>>> > with respect to read performance?
> > > > >> >>>> >
> > > > >> >>>> >
> > > > >> >>>> > On Tue, Jul 30, 2013 at 12:06 PM, Jean-Daniel Cryans <
> > > > >> >>>> > jdcryans@apache.org> wrote:
> > > > >> >>>> >
> > > > >> >>>> >> That's a tough one.
> > > > >> >>>> >>
> > > > >> >>>> >> One thing that comes to mind is socket reuse. It used to come up
> > > > >> >>>> >> more often, but this is an issue that people hit when doing loads of
> > > > >> >>>> >> random reads. Try enabling tcp_tw_recycle, but I'm not guaranteeing
> > > > >> >>>> >> anything :)
> > > > >> >>>> >>
> > > > >> >>>> >> Also, if you _just_ want to saturate something, be it CPU or network,
> > > > >> >>>> >> wouldn't it be better to hit data only in the block cache? This way
> > > > >> >>>> >> it has the lowest overhead.
> > > > >> >>>> >>
> > > > >> >>>> >> Last thing I wanted to mention is that yes, the client doesn't scale
> > > > >> >>>> >> very well. I would suggest you give the asynchbase client a run.
> > > > >> >>>> >>
> > > > >> >>>> >> J-D
> > > > >> >>>> >>
> > > > >> >>>> >> On Tue, Jul 30, 2013 at 11:23 AM, Vladimir Rodionov
> > > > >> >>>> >> <vrodionov@carrieriq.com> wrote:
> > > > >> >>>> >> > I have been doing quite extensive testing of different read
> > > > >> >>>> >> > scenarios:
> > > > >> >>>> >> >
> > > > >> >>>> >> > 1. blockcache disabled/enabled
> > > > >> >>>> >> > 2. data is local/remote (no good hdfs locality)
> > > > >> >>>> >> >
> > > > >> >>>> >> > and it turned out that I cannot saturate 1 RS using one
> > > > >> >>>> >> > (comparable in CPU power and RAM) client host:
> > > > >> >>>> >> >
> > > > >> >>>> >> > I am running a client app with 60 read threads active (with
> > > > >> >>>> >> > multi-get) that is going to one particular RS, and this RS's load
> > > > >> >>>> >> > is 100-150% (out of 3200% available) - it means that load is ~5%.
> > > > >> >>>> >> >
> > > > >> >>>> >> > All threads in the RS are either in BLOCKED (wait) or IN_NATIVE
> > > > >> >>>> >> > (epoll) states.
> > > > >> >>>> >> >
> > > > >> >>>> >> > I attribute this to the HBase client implementation, which seems
> > > > >> >>>> >> > to be not scalable (I am going to dig into the client later today).
> > > > >> >>>> >> >
> > > > >> >>>> >> > Some numbers: the maximum I could get from single Get (60 threads):
> > > > >> >>>> >> > 30K per sec. Multiget gives ~75K (60 threads).
> > > > >> >>>> >> >
> > > > >> >>>> >> > What are my options? I want to measure the limits, and I do not
> > > > >> >>>> >> > want to run a cluster of clients against just ONE Region Server.
> > > > >> >>>> >> >
> > > > >> >>>> >> > RS config: 96GB RAM, 16 (32) CPU
> > > > >> >>>> >> > Client:    48GB RAM,  8 (16) CPU
> > > > >> >>>> >> >
> > > > >> >>>> >> > Best regards,
> > > > >> >>>> >> > Vladimir Rodionov
> > > > >> >>>> >> > Principal Platform Engineer
> > > > >> >>>> >> > Carrier IQ, www.carrieriq.com
> > > > >> >>>> >> > e-mail: vrodionov@carrieriq.com
> > > > >> >>>> >> >
> > > > >> >>>> >> >
> > > > >> >>>> >>
> > > > >> >>>>
> > > > >> >>>
> > > > >> >>>
> > > > >> >>
> > > > >> >
> > > > >>
> > > > >>
> > > > >
> > > >
> > >
> > >
> >
>
