hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: RS unresponsive after series of deletes
Date Thu, 14 Jun 2012 17:38:55 GMT
On Wed, Jun 13, 2012 at 12:09 PM, Ted Tuttle
<ted.tuttle@mentacapital.com> wrote:
> My client code has a set of deletes to carry out.  After successfully issuing 19 such
> deletes the client begins logging HBase errors while trying to complete the deletes.  It
> logs ERRORs every 60s for 10 times and then gives up.

What kind of delete are you doing?  Are you deleting individual
cells?  When you say 19 deletes, is each of these a batch delete?  If
it is a cell delete, we need to read the cell first to find the most
recent timestamp.  Looks like we are timing out the rpc doing your
batch of deletes.  Could it be that a batch is doing a bunch at one
time and taking a long time to complete?  Try making smaller batches?
(Delete of 144 rows taking a minute seems like way too long though, or
is the delete of a row made up of many individual deletes?  A delete
of a column family on a row is cheaper than a cell delete because it
just puts a marker on the column family -- See
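
A rough sketch of the "smaller batches" idea, in plain Java so it runs
without a cluster.  The class name DeleteBatcher, the partition helper
and the row keys are made up for illustration; in real client code each
small batch would be turned into a list of Delete objects and passed to
a call such as HTable.delete(List<Delete>), which is left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DeleteBatcher {

    /** Splits items into consecutive chunks of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Hypothetical row keys standing in for the rows to delete.
        List<String> rowKeys = Arrays.asList("row-1", "row-2", "row-3", "row-4", "row-5");
        for (List<String> batch : partition(rowKeys, 2)) {
            // Assumption: this is where one small batch would be sent to the table,
            // so that no single rpc carries the whole set of deletes.
            System.out.println("would delete " + batch.size() + " rows: " + batch);
        }
    }
}
```

Smaller batches mean each rpc does less work before it returns, so a
single slow batch is less likely to hit the rpc timeout.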

> Ultimately, the RS became responsive again. Looking at monitoring I see spike in CPU
> utilization on node that is unresponsive; it goes from 2% utilization to 20% and sticks there
> for a few minutes.  None of the other nodes in the cluster appear busy at this time.

Want to try thread dumping it when it goes unresponsive?  That'd help
us figure out what the regionserver was doing at the time when it's
burning 20% (Do you have gc logging enabled?  Anything in the .out file
at this time when we are using CPU?)
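
For reference, a minimal sketch of what a thread dump captures, using
the standard java.lang.management API.  It dumps the threads of the JVM
it runs in, so it only illustrates the shape of the output; dumping a
running regionserver would normally be done from outside the process
with a tool such as jstack.  The class name ThreadDumpSketch and the
dump helper are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {

    /** Returns name, state and full stack trace for every live thread in this JVM. */
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip lock details, which keeps the call cheap.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Taking a few dumps some seconds apart while the CPU is busy shows which
threads stay in the same frames, which is usually the interesting part.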

