hbase-user mailing list archives

From Pradeep Gollakota <pradeep...@gmail.com>
Subject Re: Performance tuning
Date Sat, 21 Dec 2013 19:37:40 GMT
Do you know if machines 19-23 are on a different rack? It seems to me that
your problem might be a networking problem. Whether it is hardware,
configuration or something else entirely, I'm not sure. It might be
worthwhile to talk to your systems administrator to see why pings to these
machines are slow. What are the pings like from a bad RS to another bad RS?
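To compare those pings systematically, here is a minimal Python sketch, assuming Linux-style `ping` output (lines containing `time=0.130 ms`). Run it on a good RS and on a bad RS against the same peer list and compare the results. The helper names and the hostnames in the usage note are hypothetical, and the 2x threshold is only an illustrative default.

```python
import re
import statistics
import subprocess

# Matches the per-packet round-trip time in Linux ping output, e.g. "time=0.130 ms".
RTT_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtts(ping_output):
    """Return the per-packet round-trip times (in ms) found in ping output."""
    return [float(value) for value in RTT_RE.findall(ping_output)]


def median_rtt(host, count=20):
    """Ping `host` from the local machine and return the median RTT in ms, or None."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
        check=False,
    )
    rtts = parse_rtts(result.stdout)
    return statistics.median(rtts) if rtts else None


def flag_slow(medians, factor=2.0):
    """Return hosts whose median RTT exceeds `factor` times the best median seen."""
    if not medians:
        return []
    best = min(medians.values())
    return sorted(host for host, rtt in medians.items() if rtt > factor * best)
```

For example, on each machine you could build `medians = {h: median_rtt(h) for h in ["rs01", "rs19", "rs20"]}` (placeholder hostnames), drop any `None` entries, and pass the result to `flag_slow`. If bad-to-bad pings are fast while good-to-bad pings are slow, that would point at the path between the two groups (for example a rack or switch boundary) rather than at the machines themselves.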


On Sat, Dec 21, 2013 at 7:17 PM, Kristoffer Sjögren <stoffe@gmail.com> wrote:

> Hi
>
> I have been performance tuning HBase 0.94.6 running Phoenix 2.2.0 for the
> last couple of days and need some help.
>
> Background.
>
> - 23-machine cluster, 32 cores, 4GB heap per RS.
> - Table t_24 has 24 online regions (24 salt buckets).
> - Table t_96 has 96 online regions (96 salt buckets).
> - 10.5 million rows per table.
> - Count query - select count(*) from ...
> - Group by query - select A, B, C, sum(D) from ... where (A = 1 and T >= 0
> and T <= 2147482800) group by A, B, C;
>
> What I ultimately found is that region servers 19, 20, 21, 22 and 23 are
> consistently 2-3x slower than the others. This hurts overall latency pretty
> badly, since queries are executed in parallel on the RS and then aggregated
> at the client (through Phoenix). In Hannibal, regions are spread out evenly
> over region servers according to salt buckets (a Phoenix feature that
> pre-creates regions and adds a rowkey prefix).
>
> As far as I can tell, there is no network or hardware configuration
> divergence between the machines. No CPU, network or other notable
> divergence in Ganglia. No RS metric differences in HBase master console.
>
> The only thing that may be of interest is that pings (within the cluster)
> to the bad RS are about 2-3x slower, around 0.130ms vs 0.050ms for the
> others. Not sure if this is significant, but I get a bad feeling about it
> since it matches exactly the RS that stood out in my performance tests.
>
> Any ideas on how I might find the source of this problem?
>
> Cheers,
> -Kristoffer
>
