hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: Heads-up: big commit in next day or so; "HBASE-2692 Master rewrite and cleanup for 0.90"
Date Wed, 01 Sep 2010 03:22:43 GMT
More performance:

+ Splits should run faster now that the daughters are brought up
immediately on the parent's hosting server, rather than later after a
message to the master and after the master assigns the daughter regions
out for opening.  (The load balancer will rebalance off the parent's
host later if needed.)
+ Smarter load balancer that is parsimonious about when it moves
regions

The above should mean regions are offline for shorter periods of time.
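
To make "parsimonious" concrete, here is a minimal sketch of a balancer that moves a region only when two servers differ by more than one region. It is written in Python for brevity (HBase itself is Java) and is not the balancer in the patch; the function name plan_moves and the dict-of-lists input shape are assumptions made for illustration.

```python
def plan_moves(assignments):
    """Plan region moves, touching as little as possible.

    assignments maps a server name to the list of regions it hosts.
    Returns a list of (region, source, destination) tuples.
    Hypothetical sketch: not HBase's actual balancer.
    """
    # Work on a copy so the caller's view of the cluster is untouched.
    loads = {server: list(regions) for server, regions in assignments.items()}
    moves = []
    if not loads:
        return moves
    while True:
        busiest = max(loads, key=lambda s: len(loads[s]))
        idlest = min(loads, key=lambda s: len(loads[s]))
        # Within one region of each other counts as balanced: do nothing.
        if len(loads[busiest]) - len(loads[idlest]) <= 1:
            return moves
        region = loads[busiest].pop()
        loads[idlest].append(region)
        moves.append((region, busiest, idlest))
```

With this rule, a server that has just kept both daughters of a split is left alone unless it ends up more than one region ahead of the least loaded server.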

St.Ack


On Tue, Aug 31, 2010 at 4:47 PM, Jonathan Gray <jgray@facebook.com> wrote:
> @Ted, what Ryan said.  Please don't keep asking for performance numbers
> after each change.  We are spending effort writing code and testing for
> correctness.  If numbers are available, we will not hide them.  Otherwise,
> it would be awesome if you wanted to lead an effort to do ongoing
> performance tests.
>
> As far as what could have a performance impact...
>
> - Cluster startup can be drastically faster and the ability to not lose
>   data locality across restarts will be fairly trivial after this goes in
> - Enable/disable should be significantly faster
> - Region assignment should be faster than current trunk though addition of
>   ZK does add latency compared with an RPC-only design
> - Multi-threading and priority abstractions added.  Already done for things
>   like open/close, next up is flush/split/compact
> - Removal of BaseScanner means we do not ever need to wait for a meta scan
>   to trigger any master operations
> - Removal of heartbeat piggybacking means we do not ever need to wait for a
>   heartbeat to send an RS a message
>
> Other stuff like admin functions going straight to RS will open up the
> ability for us to make things that are only async today work in either a
> sync or async fashion.
>
> Lastly, we are moving away from things like RetryableMetaOperations which
> use the combination of maxRetries and delay when META is not available.
> Now this is strictly set as a maxTimeout.
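
The difference between a retries-and-delay budget and a single maxTimeout can be sketched as below. This is an illustration in Python rather than HBase's Java code; wait_for, poll_interval and the injectable clock/sleep parameters are hypothetical names, not part of any HBase API.

```python
import time


def wait_for(check, max_timeout, poll_interval=0.01,
             clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or max_timeout seconds elapse.

    The caller states one wall-clock budget instead of tuning a
    maxRetries * delay product.  Hypothetical sketch, not HBase code.
    """
    deadline = clock() + max_timeout
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval, remaining))
```

A caller waiting on META would pass a check that probes availability and a single timeout, for example wait_for(meta_is_available, 30.0), where meta_is_available is a placeholder for whatever probe the caller has.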
>
> JG
>
>
>> -----Original Message-----
>> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
>> Sent: Tuesday, August 31, 2010 3:36 PM
>> To: dev@hbase.apache.org
>> Subject: Re: Heads-up: big commit in next day or so; "HBASE-2692 Master
>> rewrite and cleanup for 0.90"
>>
>> There might not be straight line performance, but there are features
>> to be enabled, and also some things that are sped up, like region
>> assignment, don't show in many standard performance tests (eg: ycsb).
>>
>> If you are serious, maybe you could help by running performance tests?
>>  Running performance tests is not an easy thing, and can occupy a
>> senior engineer an entire day running a series of tests just to
>> produce 1 spreadsheet.  In reality it's performance #s vs working
>> code.  I think you know the one we pick.
>>
>> -ryan
>>
>> On Tue, Aug 31, 2010 at 2:29 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > Jonathan:
>> > Can you publish performance metrics (compared with current trunk) from
>> > a cluster running the new master?
>> >
>> > Thanks
>> >
>> > On Tue, Aug 31, 2010 at 10:20 AM, Jonathan Gray <jgray@facebook.com> wrote:
>> >
>> >> Though I'm sure my vote is clear, I'm +1 on this.
>> >>
>> >> The plan at fb is to update our internal branch to (almost) the current
>> >> head of trunk, before the commit of the master branch.  Ongoing testing
>> >> will continue on this branch.
>> >>
>> >> In parallel, testing will also begin here on the new master following
>> >> the mega commit.
>> >>
>> >> Hopefully we can transition everything to the new master sooner rather
>> >> than later instead of splitting time.  I'd say shortly after initial
>> >> testing is complete we should push for a new master 0.89 or 0.90RC and
>> >> ask users to test as much as possible.
>> >>
>> >>
>> >> I did as much as possible to try and get reviews along the way, including
>> >> several very early design discussions and group code review sessions, but
>> >> this is a pretty radical change so it has not been easy.  If you're
>> >> familiar with the old BaseScanner, RegionManager, ZooKeeperWrapper, etc.,
>> >> this stuff has been cut and the replacements are much shorter/simpler.
>> >>
>> >> Just need to find all the bugs and fill in the oversights :)
>> >>
>> >> Stack, thanks for carrying this thing over the finish line.
>> >>
>> >> JG
>> >>
>> >> > -----Original Message-----
>> >> > From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
>> >> > Sent: Tuesday, August 31, 2010 12:44 AM
>> >> > To: HBase Dev List
>> >> > Subject: Heads-up: big commit in next day or so; "HBASE-2692 Master
>> >> > rewrite and cleanup for 0.90"
>> >> >
>> >> > I just posted the patch to https://review.cloudera.org/r/750/.  It's a
>> >> > little on the large side (1.5MB. Sorry about that).
>> >> >
>> >> > The bulk of the patch is by Karthik Ranganathan and Jon Gray.  They've
>> >> > been working on it in the 0.90_master_rewrite branch for a good few
>> >> > months now.  It's been reviewed pretty extensively, multiple times, but
>> >> > it's too big for any one individual to review in anything but a cursory
>> >> > manner in its current form (Again, sorry about that).  Piecemealing
>> >> > the changes into the code base was tried but getting all of the
>> >> > stepped changes in was going to take eons to complete and when we
>> >> > tried it, it wasn't working well anyway -- reviewers had a hard time
>> >> > getting their heads around partial feature implementations, and
>> >> > groundwork baffled when the superstructure wasn't coming till a later
>> >> > stage.
>> >> >
>> >> > This patch addresses head on issues that have plagued us for what
>> >> > seems like ages now -- troublesome assignment of regions in
>> >> > particular -- and IMO, in spite of its size and lack of review, unless
>> >> > there is objection, I'm going to go ahead and commit this patch
>> >> > tomorrow or the day after, after all tests pass.  We could let this
>> >> > monster stew out on the branch for another couple of weeks or a month
>> >> > but IMO, it's mature enough to be added to TRUNK so we can all work on
>> >> > the stabilization that will get us to a 0.90.0 Release Candidate.
>> >> >
>> >> > See the umbrella issue for all that's addressed -- about 11 or 12
>> >> > issues in all, a few of them blockers -- but here is a synopsis of
>> >> > what the patch includes:
>> >> >
>> >> > + Region in transition data structure is now kept out in zookeeper to
>> >> > facilitate master failover and to do away with race conditions that
>> >> > used to result in double assignment of regions
>> >> > + Open and close of regions as well as server shutdown handling and
>> >> > table transitions are now done in Executors; config. says how much
>> >> > parallelism to run with.  Default is 3 openers, 3 closers, with
>> >> > designated handlers for meta and root opening, etc. (We used to be
>> >> > single-threaded in master and regionserver doing opens/closes, etc.)
>> >> > + New load balancer; features include figuring out the plan on startup
>> >> > and then assigning out all regions in the one assignment.  New method
>> >> > in admin tool allows you to unload a region from one server and assign
>> >> > it to another explicit server.
>> >> > + Most of what passed over the heartbeating mechanism has now moved to
>> >> > go via zk, or the master directly invokes rpc to close/open regions
>> >> > rather than wait on heartbeat to come around
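
The executor arrangement in the list above (separate pools per kind of work, 3 openers and 3 closers by default, designated handlers for meta and root) might be sketched roughly as follows. This is a Python illustration, not the Java code in the patch; the class name RegionExecutors, the pool keys, and the single-thread size for the meta and root pools are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

# 3 openers / 3 closers come from the announcement; the pool names and
# the size of 1 for the meta and root pools are assumptions.
DEFAULT_POOL_SIZES = {
    "open_region": 3,
    "close_region": 3,
    "open_meta": 1,
    "open_root": 1,
}


class RegionExecutors:
    """One thread pool per kind of event (hypothetical sketch)."""

    def __init__(self, pool_sizes):
        self._pools = {
            kind: ThreadPoolExecutor(max_workers=size, thread_name_prefix=kind)
            for kind, size in pool_sizes.items()
        }

    def submit(self, kind, handler, *args):
        # Work of one kind never queues behind work of another kind.
        return self._pools[kind].submit(handler, *args)

    def shutdown(self):
        for pool in self._pools.values():
            pool.shutdown(wait=True)


def open_region(name):
    # Stand-in for the real work of opening a region.
    return f"opened {name}"
```

Keeping meta and root opens in their own pools means they are not stuck behind a backlog of ordinary region opens, which is one plausible reading of "designated handlers".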
>> >> >
>> >> > There is more, including a bunch of cleanup and refactorings that in
>> >> > particular facilitate testing, and this patch lays the groundwork for
>> >> > new features coming down the pipeline (the same executor/handler
>> >> > mechanism will get us parallel flushing, splitting and compacting).
>> >> >
>> >> > Things will look different after this patch goes in: there are lots of
>> >> > zk transitions in logs now, and this patch is going to drum up new
>> >> > kinds of bugs, but after a week of gung-ho bug bashing we should have
>> >> > ourselves a more robust hbase.
>> >> >
>> >> > St.Ack
>> >>
>> >
>
