hbase-user mailing list archives

From Khaled Elmeleegy <kd...@hotmail.com>
Subject RE: HBase read performance
Date Thu, 02 Oct 2014 17:05:45 GMT
Thanks Lars for your quick reply.

Yes, performance is similar with fewer handlers (I tried with 100 first).

The payload is not big, ~1KB or so. The working set doesn't seem to fit in memory, as there
are many cache misses. However, disk is far from being a bottleneck; I checked using iostat.
I also verified that neither the network nor the CPU of the region server or the client is
a bottleneck. This leads me to believe that this is likely a software bottleneck, possibly
due to a misconfiguration on my side. I just don't know how to debug it. A clear disconnect
I see is between the individual request latency reported by the region server metrics (IPC
processCallTime vs scanNext) and what's measured on the client. Does this sound right? Any
ideas on how to better debug it?
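
One way to line up the client-side view with the region server percentiles is to record
per-call latencies on the client and summarize them the same way (median, 95th, 99th).
Below is a minimal sketch in plain Java with no HBase classes. The class name LatencyStats
and both methods are hypothetical helpers, and the Runnable stands in for whatever single
call is being timed (for example one getScanner(..) plus next() round trip).

```java
import java.util.Arrays;

// Hypothetical helper for summarizing client-side latencies; not part of HBase.
final class LatencyStats {

    // Nearest-rank percentile over an already sorted array of samples.
    static long percentile(long[] sorted, double pct) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        int rank = (int) Math.ceil(pct * sorted.length / 100.0);
        return sorted[Math.max(0, rank - 1)];
    }

    // Runs the call n times, returning the per-call wall-clock times in ms, sorted ascending.
    // "call" is an assumption: it stands in for one client request to the region server.
    static long[] timeCallsMillis(Runnable call, int n) {
        long[] samples = new long[n];
        for (int i = 0; i < n; i++) {
            long start = System.nanoTime();
            call.run();
            samples[i] = (System.nanoTime() - start) / 1_000_000;
        }
        Arrays.sort(samples);
        return samples;
    }
}
```

Comparing percentile(samples, 50) and percentile(samples, 99) from the client with the
processCallTime and scanNext percentiles on the server should show whether the extra time
is spent inside the region server or between the client call and the server handler.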

About the timestamp trick to enable a forward scan: thanks for pointing it out. Actually,
I am aware of it. The problem is that sometimes I want the key after a particular timestamp
and sometimes the key before, so relying on key order alone doesn't work. Ideally, I want
a reverse get(). I thought a reverse scan could do the trick, though.
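
For reference, the MAX_LONG-time encoding Lars describes can be sketched as follows. This
is only an illustration in plain Java: the class ReverseTimestampKey, the prefix-plus-8-bytes
key layout, and the method names are assumptions, not the actual schema of the table. With
non-negative timestamps, newer entries sort first under unsigned byte-wise comparison (the
order HBase uses for row keys), so a forward scan starting at encode(prefix, T) would hit
the newest entry at or before T first. It does not cover the "first entry after T" case,
which is the limitation described above.

```java
import java.nio.ByteBuffer;

// Hypothetical row-key helper; the key layout <prefix><Long.MAX_VALUE - ts> is an assumption.
final class ReverseTimestampKey {

    // Builds <prefix><8-byte big-endian (Long.MAX_VALUE - timestamp)>.
    static byte[] encode(byte[] prefix, long timestampMillis) {
        if (timestampMillis < 0) {
            throw new IllegalArgumentException("timestamp must be non-negative");
        }
        long reversed = Long.MAX_VALUE - timestampMillis;
        return ByteBuffer.allocate(prefix.length + Long.BYTES)
                .put(prefix)
                .putLong(reversed)
                .array();
    }

    // Recovers the original timestamp from a key built by encode().
    static long decodeTimestamp(byte[] rowKey, int prefixLength) {
        long reversed = ByteBuffer.wrap(rowKey, prefixLength, Long.BYTES).getLong();
        return Long.MAX_VALUE - reversed;
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```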


> Date: Thu, 2 Oct 2014 09:40:37 -0700
> From: larsh@apache.org
> Subject: Re: HBase read performance
> To: user@hbase.apache.org
> Hi Khaled,
> is it the same with fewer threads? 1500 handler threads seems to be a lot. Typically
> a good number of threads depends on the hardware (number of cores, number of spindles, etc).
> I cannot think of any type of scenario where more than 100 would give any improvement.
> How large is the payload per KV retrieved that way? If large (as in a few 100k) you definitely
> want to lower the number of the handler threads.
> How much heap do you give the region server? Does the working set fit into the cache?
> (i.e. in the metrics, do you see the eviction count going up? If so, it does not fit into the
> cache.)
> If the working set does not fit into the cache (eviction count goes up) then HBase will
> need to bring a new block in from disk on each Get (assuming the Gets are more or less random
> as far as the server is concerned).
> In that case you'll benefit from reducing the HFile block size (from 64k to 8k or even 4k).
> Lastly, I don't think we tested the performance of using reverse scan this way; there
> is probably room to optimize this.
> Can you restructure your keys to allow forward scanning? For example, you could store
> the time as MAX_LONG-time. Or you could invert all the bits of the time portion of the key,
> so that it sorts the other way. Then you could do a forward scan.
> Let us know how it goes.
> -- Lars
> ----- Original Message -----
> From: Khaled Elmeleegy <kdiaa@hotmail.com>
> To: "user@hbase.apache.org" <user@hbase.apache.org>
> Cc:
> Sent: Thursday, October 2, 2014 12:12 AM
> Subject: HBase read performance
> Hi,
> I am trying to do a scatter/gather on hbase (, where I have a client reading
> ~1000 keys from an HBase table. These keys happen to fall on the same region server. For my
> reads I use reverse scan to read each key as I want the key prior to a specific time stamp
> (time stamps are stored in reverse order). I don't believe gets can accomplish that, right?
> So I use scan, with caching set to 1.
> I use 2000 reader threads in the client and on HBase, I've set hbase.regionserver.handler.count
> to 1500. With this setup, my scatter/gather is very slow and can take up to 10s in total.
> Timing an individual getScanner(..) call on the client side, it can easily take a few hundred
> ms. I also got the following metrics from the region server in question:
> "queueCallTime_mean" : 2.190855525775637,
> "queueCallTime_median" : 0.0,
> "queueCallTime_75th_percentile" : 0.0,
> "queueCallTime_95th_percentile" : 1.0,
> "queueCallTime_99th_percentile" : 556.9799999999818,
> "processCallTime_min" : 0,
> "processCallTime_max" : 12755,
> "processCallTime_mean" : 105.64873440912682,
> "processCallTime_median" : 0.0,
> "processCallTime_75th_percentile" : 2.0,
> "processCallTime_95th_percentile" : 7917.95,
> "processCallTime_99th_percentile" : 8876.89,
> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_min" : 89,
> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_max" : 11300,
> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_mean" : 654.4949739797315,
> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_median" : 101.0,
> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_75th_percentile" : 101.0,
> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_95th_percentile" : 101.0,
> "namespace_default_table_delta_region_87be70d7710f95c05cfcc90181d183b4_metric_scanNext_99th_percentile" : 113.0,
> Where "delta" is the name of the table I am querying.
> In addition to all this, I monitored the hardware resources (CPU, disk, and network)
> of both the client and the region server and nothing seems anywhere near saturation. So I
> am puzzled by what's going on and where this time is going.
> A few things to note based on the above measurements: the medians of both IPC processCallTime
> and queueCallTime are basically zero (ms, I presume?). However, scanNext_median is 101
> (ms too, right?). I am not sure how this adds up. Also, even though the 101 figure seems outrageously
> high and I don't know why, all these scans should be happening in parallel, so the overall
> call should finish fast, given that no hardware resource is contended, right? But this is
> not what's happening, so I must be missing something.
> So, any help is appreciated there.
> Thanks,
> Khaled