hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Heads-up: big commit in next day or so; "HBASE-2692 Master rewrite and cleanup for 0.90"
Date Tue, 31 Aug 2010 07:43:44 GMT
I just posted the patch to https://review.cloudera.org/r/750/.  It's a
little on the large side (1.5MB. Sorry about that).

The bulk of the patch is by Karthik Ranganathan and Jon Gray.  They've
been working on it in the 0.90_master_rewrite branch for a good few
months now.  It's been reviewed pretty extensively, multiple times, but
it's too big for any one individual to review in anything but a cursory
manner in its current form (Again, sorry about that).  Piece-mealing
the changes into the code base was tried but getting all of the
stepped changes in was going to take eons to complete and when we
tried it, it wasn't working well anyways -- reviewers had a hard time
getting their heads around partial feature implementations and
groundwork baffled when the superstructure wasn't coming till a later

This patch addresses issues head on that have plagued us for what
seems like ages now --- troublesome assignment of regions in
particular -- and IMO in spite of its size and lack of review, unless
there is objection, I'm going to go ahead and commit this patch tomorrow
or the day after, after all tests pass.  We could let this monster stew
out on the branch for another couple of weeks or a month but IMO, it's
mature enough to be added to TRUNK so we can all work on the
stabilization that will get us to 0.90.0 Release Candidate.

See the umbrella issue for all that's addressed -- about 11 or 12
issues in all, a few of them blockers -- but here is a synopsis of
what the patch includes:

+ Region in transition data structure is now kept out in zookeeper to
facilitate master failover and to do away with race conditions that
used to result in double assignment of regions
+ Open and close of regions as well as server shutdown handling and
table transitions are now done in Executors; configuration says how much
parallelism to run with.  Default is 3 openers, 3 closers, with
designated handlers for meta and root opening, etc. (We used to be
single-threaded in master and regionserver doing opens/closes, etc.)
+ New load balancer; features include figuring out the plan on startup
and then assigning out all regions in the one assignment.  New method
in admin tool allows you to unload a region from one server and assign it
to another explicit server.
+ Most of what passed over the heartbeating mechanism now goes via zk,
or the master directly invokes rpc to close/open regions rather than
waiting on the heartbeat to come around
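
To make the load balancer item above a little more concrete, here is a
minimal sketch of "figure out the plan up front, then assign everything
in one go".  It is not code from the patch: the class and method names
are hypothetical, and plain round-robin is only a stand-in for whatever
the new balancer actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the balancer from HBASE-2692.
public class StartupAssignmentSketch {

    // Build the whole region -> server plan before anything is assigned.
    // Round-robin is an assumption made for this sketch.
    public static Map<String, List<String>> planRoundRobin(
            List<String> regions, List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("no servers to assign to");
        }
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (String server : servers) {
            plan.put(server, new ArrayList<>());
        }
        for (int i = 0; i < regions.size(); i++) {
            String server = servers.get(i % servers.size());
            plan.get(server).add(regions.get(i));
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, List<String>> plan = planRoundRobin(
                Arrays.asList("r1", "r2", "r3", "r4", "r5"),
                Arrays.asList("rs1", "rs2"));
        // With the full plan in hand, each server could be handed its
        // regions in a single bulk assignment.
        for (Map.Entry<String, List<String>> e : plan.entrySet()) {
            System.out.println(e.getKey() + " <- " + e.getValue());
        }
    }
}
```

The point of computing the plan first is that the master can see the
whole distribution before it touches any region, rather than deciding
one region at a time.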

There is more including a bunch of cleanup and refactorings that in
particular facilitate testing, and this patch lays the groundwork for
new features coming down the pipeline (the same executor/handler
mechanism will get us parallel flushing, splitting and compacting).
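
For anyone who has not looked at the branch, the executor/handler idea
can be pictured roughly as below: one fixed-size pool per kind of event,
with handlers submitted as runnables.  This is a sketch under
assumptions, not the patch's code; the class, enum and method names are
made up for illustration, and only the 3 openers / 3 closers defaults
come from the description above.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; names do not come from the patch.
public class HandlerPoolsSketch {

    public enum EventType { OPEN_REGION, CLOSE_REGION }

    // One dedicated thread pool per event type.
    private final Map<EventType, ExecutorService> pools =
            new EnumMap<>(EventType.class);

    public HandlerPoolsSketch(int openers, int closers) {
        pools.put(EventType.OPEN_REGION, Executors.newFixedThreadPool(openers));
        pools.put(EventType.CLOSE_REGION, Executors.newFixedThreadPool(closers));
    }

    // Queue a handler on the pool that matches its event type.
    public void submit(EventType type, Runnable handler) {
        pools.get(type).execute(handler);
    }

    // Stop accepting work and wait for queued handlers to finish.
    // Returns false on timeout or interruption.
    public boolean shutdownAndWait(long timeoutSeconds) {
        for (ExecutorService pool : pools.values()) {
            pool.shutdown();
        }
        boolean done = true;
        try {
            for (ExecutorService pool : pools.values()) {
                done &= pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return done;
    }

    public static void main(String[] args) {
        HandlerPoolsSketch pools = new HandlerPoolsSketch(3, 3);
        AtomicInteger opened = new AtomicInteger();
        for (int i = 0; i < 10; i++) {
            // Stand-in for a real "open this region" handler.
            pools.submit(EventType.OPEN_REGION, opened::incrementAndGet);
        }
        boolean finished = pools.shutdownAndWait(10);
        System.out.println("finished=" + finished + " opened=" + opened.get());
    }
}
```

Adding another kind of parallel work (flushes, splits, compactions)
would then mostly be a matter of adding an event type and sizing its
pool.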

Things will look different after this patch goes in; there are lots of
zk transitions in logs now, and this patch is going to drum up new
kinds of bugs but after a week of gung-ho bug bashing we should have
ourselves a more robust hbase.

