hbase-user mailing list archives

From Vladimir Rodionov <vrodio...@carrieriq.com>
Subject RE: Linear Scalability in HBase
Date Fri, 25 Oct 2013 18:21:21 GMT
You cannot saturate a region server with one client (unless, perhaps, you use hbase-async)
if all data is cached in RAM.
In our performance tests we had to run 10 clients (on different hosts) with 30 threads each
to max out 1 RS when all data is in cache (block, page, etc.).
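
As a rough illustration of why a single synchronous client struggles to saturate a server, Little's Law (in-flight requests = throughput × latency) puts a ceiling on what a fixed number of blocking threads can drive. The sketch below is only back-of-the-envelope arithmetic: the function names are hypothetical, and it assumes each client thread issues one blocking request at a time, which is not stated in this thread.

```python
def closed_loop_throughput(threads: int, latency_ms: float) -> float:
    """Upper bound on ops/sec that `threads` blocking threads can issue
    when each request takes `latency_ms` (Little's Law: N = X * R)."""
    if threads <= 0 or latency_ms <= 0:
        raise ValueError("threads and latency_ms must be positive")
    return threads * 1000.0 / latency_ms


def in_flight_requests(ops_per_sec: float, latency_ms: float) -> float:
    """Average number of requests in flight at a given throughput and latency."""
    if ops_per_sec < 0 or latency_ms < 0:
        raise ValueError("ops_per_sec and latency_ms must be non-negative")
    return ops_per_sec * latency_ms / 1000.0


if __name__ == "__main__":
    # One client with 32 blocking threads at 1 ms per operation (assumed model).
    print(closed_loop_throughput(32, 1.0))
    # Concurrency implied by a 15K OPS target at 1 ms latency.
    print(in_flight_requests(15000, 1.0))
```

Under these assumptions, a measured throughput far below the ceiling for the configured thread count suggests the bottleneck is on the client side or in the test setup rather than in the region server.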

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com

________________________________________
From: Ramu M S [ramu.malur@gmail.com]
Sent: Friday, October 25, 2013 9:35 AM
To: user@hbase.apache.org
Subject: Re: Linear Scalability in HBase

Hi,

For me, scalability means achieving the same throughput and latency as the
number of clients increases.

In my case the data set increases with the number of clients. That's the
reason I vary both clients and region servers.

I'm trying to identify how the cluster should grow to handle data from more
clients so that operation throughput and latency stay within defined
limits.

Currently the limit is 15K OPS throughput and 1 ms latency.

To test, I have kept the data increase at around 15 million per server.

Each YCSB client actually runs 32 threads, so it is really 15 million
more data for every 32 additional client threads.

All machines are physical servers.

1) Read and write latency is around 1 ms in the first case, whereas in the
second case it's a little higher, at 1.1 to 1.2 ms.

2) Keeping the same number of clients as in the first case, the latency dropped
to 0.7 ms, but throughput came down further to just 9K OPS.

For the tests, I'm running both clients and region servers on the same machines.
In the 8-server scenario I also tried running the clients on different machines,
but the results were almost the same as when running clients on the same machines.

Ganglia shows that system load is around 30% in both scenarios.

What I wanted to understand is how to grow the cluster to meet the needs of
both throughput and latency?

Regards,
Ramu
