hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: HBase Thrift inserts bottlenecked somewhere -- but where?
Date Fri, 01 Mar 2013 17:33:50 GMT
The primary unit of load distribution in HBase is the region; make
sure you have more than one. This is well documented in the manual.
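
One common way to start a table with more than one region is to pre-split it at creation time. The sketch below is only an illustration and is not from the original thread: it assumes row keys are fixed-width byte strings spread roughly uniformly over the key space (hashed or salted prefixes, for example), and the `split_keys` helper and its parameters are hypothetical names.

```python
def split_keys(num_regions, key_bytes=4):
    """Return num_regions - 1 evenly spaced split points.

    Assumption: row keys are uniformly distributed over a fixed-width
    key space of `key_bytes` bytes. Skewed keys need hand-picked splits.
    """
    if num_regions < 2:
        return []  # a single region has no split points
    space = 256 ** key_bytes
    return [
        (i * space // num_regions).to_bytes(key_bytes, "big")
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    # Example: 12 regions, i.e. two per RegionServer on a 6-node cluster.
    for key in split_keys(12):
        print(key.hex())
```

The resulting keys could then be supplied when the table is created, for instance through the SPLITS option of the HBase shell `create` command. If the real row keys are skewed (sequential IDs or timestamps, say), evenly spaced split points will not spread the write load and the split points would need to be chosen from the actual key distribution.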


On Fri, Mar 1, 2013 at 4:17 AM, Dan Crosta <dan@magnetic.com> wrote:
> We are using a 6-node HBase cluster with a Thrift Server on each of the
> RegionServer nodes, and trying to evaluate maximum write throughput for our
> use case (which involves many processes sending mutateRowsTs commands).
> Somewhere between about 30 and 40 processes writing into the system we cross
> the threshold where adding additional writers yields only very limited
> returns to throughput, and I'm not sure why. We see that the CPU and Disk on
> the DataNode/RegionServer/ThriftServer machines are not saturated, nor is the
> NIC in those machines. I'm a little unsure where to look next.
> A little more detail about our deployment:
> * CDH 4.1.2
> * DataNode/RegionServer/ThriftServer class: EC2 m1.xlarge
> ** RegionServer: 8GB heap
> ** ThriftServer: 1GB heap
> ** DataNode: 4GB heap
> ** EC2 ephemeral (i.e. local, not EBS) volumes used for HDFS
> If there's any other information that I can provide, or any other
> configuration or system settings I should look at, I'd appreciate the
> pointers.
> Thanks,
>  - Dan
