hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Question regarding scalability of regionservers
Date Wed, 17 Feb 2010 05:23:37 GMT
On Tue, Feb 16, 2010 at 7:28 PM, Brad McCarty <mccarbc@gmail.com> wrote:

> I read in another post that if one has a "hot" row in a table, meaning very heavy read
> access to the same row, that the regionserver managing the region with that row can become
> a single bottleneck.

If the row is hot, it'll probably stay resident in cache.

> Is my understanding accurate?  If so, then assuming I can cache the data in the memstore,
> will CPU utilization become the likely limiting resource on that regionserver?

Yes.  That should be the case.

> Also, if I'm hitting the region server from many client servers
> (Tomcat app servers), will the socket connection management overhead
> on the regionserver overwhelm that server?

How many clients?  4 or 500 tomcat threads?

The way the IPC between the hbase client and server works is that the
client keeps a single socket connection open to a given server and
multiplexes requests and responses over that one connection.  This is
how hadoop rpc works.
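
To make the multiplexing idea concrete, here is a minimal sketch of how
several in-flight calls can share one connection: each request is tagged
with a call id, and the response is routed back to its caller by that id.
This is an illustration only, not hbase or hadoop code. The class
MultiplexedConnection, its methods, and the "wire" callback standing in
for the socket are all hypothetical names made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;

/** Hypothetical illustration of call-id multiplexing; not the real RPC client. */
final class MultiplexedConnection {
    private final AtomicInteger nextCallId = new AtomicInteger();
    // Calls that have been sent but not yet answered, keyed by call id.
    private final Map<Integer, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    // Stands in for the write side of the one shared socket (an assumption).
    private final BiConsumer<Integer, String> wire;

    MultiplexedConnection(BiConsumer<Integer, String> wire) {
        this.wire = wire;
    }

    /** Tags the request with a fresh call id and sends it over the shared wire. */
    CompletableFuture<String> call(String request) {
        int id = nextCallId.getAndIncrement();
        CompletableFuture<String> result = new CompletableFuture<>();
        pending.put(id, result); // register before sending so a fast reply is not lost
        wire.accept(id, request);
        return result;
    }

    /** Invoked when a response arrives; completes the caller waiting on that id. */
    void onResponse(int callId, String response) {
        CompletableFuture<String> waiting = pending.remove(callId);
        if (waiting != null) {
            waiting.complete(response);
        }
    }

    int pendingCalls() {
        return pending.size();
    }

    public static void main(String[] args) {
        List<Integer> sentIds = new ArrayList<>();
        MultiplexedConnection conn = new MultiplexedConnection((id, req) -> sentIds.add(id));
        CompletableFuture<String> first = conn.call("get row-a");
        CompletableFuture<String> second = conn.call("get row-b");
        // Responses may come back in any order; the id decides who gets which.
        conn.onResponse(sentIds.get(1), "value-b");
        conn.onResponse(sentIds.get(0), "value-a");
        System.out.println(first.join() + " " + second.join());
    }
}
```

The point of the sketch is that the number of sockets does not grow with
the number of caller threads; only the table of pending calls does.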

> If that's true, are there any other steps that can be taken to mitigate that risk, other
> than buying bigger hardware?

This is hbase.  You don't buy bigger hardware, you just add nodes (smile).

The proper answer to your questions above is to give it a test run.
Try setting up a cluster of about 5 hbase nodes and have a tomcat
server replay a query log that resembles what you might have in
production.
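
A test run like that could start from a small replay harness along these
lines. It is only a sketch under assumptions: QueryLogReplayer and its
replay method are hypothetical names, the query log is assumed to be a
list of row keys, and the lookup callback is a placeholder for whatever
read your application would actually issue against the cluster.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

/** Hypothetical replay harness; the lookup callback stands in for a real read. */
final class QueryLogReplayer {

    /**
     * Replays every row key in the log on a pool of worker threads and
     * returns how many lookups completed without throwing.
     */
    static long replay(List<String> rowKeys, int threads, Consumer<String> lookup) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong succeeded = new AtomicLong();
        for (String key : rowKeys) {
            pool.execute(() -> {
                try {
                    lookup.accept(key);
                    succeeded.incrementAndGet();
                } catch (RuntimeException e) {
                    // A failed lookup is simply not counted as a success.
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
                throw new IllegalStateException("replay did not finish within one minute");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("replay interrupted", e);
        }
        return succeeded.get();
    }

    public static void main(String[] args) {
        // A log skewed toward one "hot" row, as in the question above.
        List<String> log = Arrays.asList("hot-row", "hot-row", "other-row", "hot-row");
        long start = System.nanoTime();
        long ok = replay(log, 4, key -> { /* placeholder: issue the real read here */ });
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(ok + " lookups in " + elapsedMs + " ms");
    }
}
```

Raising the thread count while watching CPU on the regionserver that
holds the hot row is one way to see where the bottleneck shows up.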
