hbase-user mailing list archives

From Brad McCarty <mcca...@gmail.com>
Subject Question regarding scalability of regionservers
Date Wed, 17 Feb 2010 03:28:49 GMT
Hello all,

We're looking at using HBase as the backend datastore for a large-scale site where
many Tomcat app servers would access HBase in real time. Our data access pattern is not completely
random: many app servers would need to read the same common rows.

I read in another post that if a table has a "hot" row, meaning very heavy read access
to the same row, the regionserver managing the region that contains that row can become a
bottleneck.
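
To make the concern concrete, here is a toy sketch in Java of how I picture it. This is not
HBase client code: the class name, the server names (rs1..rs3), and the split points are all
made up for illustration. It only assumes that a table is split into regions by row-key range
and that each region is served by one regionserver, so every read of the same row lands on
the same server.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical names, not the HBase API.
public class HotRowSketch {
    // Maps each region's start key to the (made-up) server that hosts it.
    private final TreeMap<String, String> regionStartToServer = new TreeMap<>();

    public HotRowSketch() {
        regionStartToServer.put("", "rs1");   // region for keys before "g"
        regionStartToServer.put("g", "rs2");  // region for keys from "g" up to "p"
        regionStartToServer.put("p", "rs3");  // region for keys from "p" onward
    }

    // A row belongs to the region with the greatest start key <= the row key.
    public String serverFor(String rowKey) {
        return regionStartToServer.floorEntry(rowKey).getValue();
    }

    // Counts how many reads each server would receive for the given row keys.
    public Map<String, Integer> countReads(String[] rowKeys) {
        Map<String, Integer> counts = new HashMap<>();
        for (String key : rowKeys) {
            counts.merge(serverFor(key), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        HotRowSketch cluster = new HotRowSketch();
        // "homepage" stands in for the hot row that every app server reads.
        String[] reads = {"homepage", "homepage", "homepage", "homepage", "alice", "zoe"};
        System.out.println(cluster.countReads(reads));
    }
}
```

In this toy model, adding more servers does not help the hot row, because all reads of
"homepage" still map to the one server hosting its region.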

Is my understanding accurate? If so, then assuming I can cache the data in the memstore,
will CPU utilization become the likely limiting resource on that regionserver? Also, if I'm
hitting the regionserver from many clients (Tomcat app servers), will the overhead of managing
all those socket connections overwhelm it?

If that's true, are there any steps that can be taken to mitigate that risk, other than
buying bigger hardware?

Thanks very much,
Brad McCarty