hbase-dev mailing list archives

From "Todd Lipcon" <t...@cloudera.com>
Subject Review Request: Fix RPC deadlock when splitting regions on same RS as meta under heavy load
Date Tue, 07 Sep 2010 17:46:38 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/798/
-----------------------------------------------------------

Review request for hbase and stack.


Summary
-------

Moves all RPCs outside of the region writeLock. The writeLock is now held only long enough
to set the 'closing' flag. Once we drop the lock, any waiters will see 'closing' upon acquiring
the lock, and thus throw NSRE (NotServingRegionException).

If we abort the split, the region is reopened as before. Accessors will have gotten NSRE in
the meantime, but will just come back to the same region eventually.
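
For reviewers who want the locking pattern in isolation, here is a minimal sketch of the idea
described above. It is not the code in the diff: ClosingFlagSketch, NotServingException, get()
and split() are hypothetical names made up for illustration, and the Runnable stands in for
whatever RPCs the split performs. It only shows the write lock being held long enough to flip
the flag, accessors checking the flag after taking the read lock, and the flag being cleared
again when the split is aborted.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a region. Not the real HRegion/SplitTransaction code.
class ClosingFlagSketch {
    // Hypothetical stand-in for NSRE; unchecked here only to keep the sketch short.
    static class NotServingException extends RuntimeException {
        NotServingException(String msg) { super(msg); }
    }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean closing = false; // read and written only under the lock

    // Accessor path: acquire the read lock, then check the flag.
    String get(String row) {
        lock.readLock().lock();
        try {
            if (closing) {
                throw new NotServingException("region is closing");
            }
            return "value-for-" + row;
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock is held only for the duration of this assignment.
    private void setClosing(boolean value) {
        lock.writeLock().lock();
        try {
            closing = value;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Split path. remoteCalls is an assumed placeholder for the split's RPCs.
    // Returns true if the split went through, false if it was aborted.
    boolean split(Runnable remoteCalls) {
        setClosing(true);          // brief write lock, then released
        try {
            remoteCalls.run();     // no region lock is held while this runs
            return true;
        } catch (RuntimeException e) {
            setClosing(false);     // abort: reopen the region as before
            return false;
        }
    }

    public static void main(String[] args) {
        ClosingFlagSketch region = new ClosingFlagSketch();
        // Simulate an RPC failure so the split aborts and the region reopens.
        boolean done = region.split(() -> { throw new IllegalStateException("rpc failed"); });
        System.out.println("split done: " + done + ", read after abort: " + region.get("row1"));
    }
}
```

Because nothing holds the lock while remoteCalls runs, a handler thread that needs the read
lock during the split gets through and fails fast instead of queueing behind the writer.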


This addresses bug HBASE-2964.
    http://issues.apache.org/jira/browse/HBASE-2964


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java 3507c0d 

Diff: http://review.cloudera.org/r/798/diff


Testing
-------

YCSB testing on my cluster. Before this patch it would deadlock within an hour due to this
bug. With the patch I ran a 5-hour load test overnight and it worked OK.


Thanks,

Todd

