hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: Moving 2.0 forward
Date Mon, 15 May 2017 06:12:24 GMT
A month on. Status.

I've been working on the HBASE-14614 branch cluster testing. After a load
of fixing, the branch passes smaller test runs (an hour or so of ITBLL up
to 2B rows w/ killing monkeys). When I go larger, to a scale I've not done
in a while, I start to run into other interesting issues -- some of which
are related to AMv2 (I'm fixing), but others are not (100G WALs that take
ten minutes to split make for interesting cascades when monkeys kill
inside the ten minutes...). I intend to keep on with this larger scale
testing since it is uncovering good stuff (especially when HDFS is dog slow
because of background replications) but my thinking is that I should be
large scale testing branch-2, not just HBASE-14614. I think HBASE-14614,
the new AMv2, is now good enough to merge to master. Given it is
the last blocker, once in, I'll cut the hbase2 branch.

I'll start up a 'Merge HBASE-14614' DISCUSSION thread in the next day or so
(I need to fix some unit tests...).

The AMv2 doc is still a work in progress but should give a gist on where we
are currently[1].  There is a bunch of todo still but seems tractable; e.g.
rolling upgrade, finish doc., and we don't have an HBCK since it needs to
be recast in light of how stuff now works but a redo on HBCK is premature
given we don't know failure types as yet (we just fix the problems as they
come up).

St.Ack
1. https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#



On Thu, Apr 13, 2017 at 1:43 PM, Stack <stack@duboce.net> wrote:

> Some status:
>
> AMv2 (HBASE-14614) is near to passing all tests caveat my disabling of all
> to-do w/ fsck (fsck needs revamp) and tests that expect that they can move
> hbase:meta off master (AMv2 enforces this constraint; it is supposed to be
> enforced on AMv1 but meta-on-master is incompletely realized in AMv1 and
> AMv2). A few other tests have been disabled for various reasons. See [1]
> for full list.
>
> There is a hefty list of TODOs still (Again see the messy doc [1]) but the
> only 'blocker', IMO, is community confidence in AMv2. Currently, cluster
> tests with chaos fail (new form of 'stuck' regions). Takes time
> investigating.
>
> Will keep you all posted.
> St.Ack
>
>
>
> 1. https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#heading=h.92vclum0bvod
>
>
>
> On Fri, Mar 31, 2017 at 1:14 PM, Andrew Purtell <apurtell@apache.org>
> wrote:
>
>> +1 on branching (yay!)
>>
>> I have EC2 resources for running ITBLL etc.
>>
>>
>> On Thu, Mar 30, 2017 at 5:07 PM, Stack <stack@duboce.net> wrote:
>>
>> > Some notes on progress toward hbase2.
>> >
>> > Given that stability and performance are NOT emergent behaviors but rather
>> > projects unto themselves, my thought is that we commit all that we've
>> > agreed as core for hbase2 (see [1]), branch, and then work on stabilizing
>> > and perf rather than stabilize, commit, and then branch. What this means
>> > in practice is that for features like Inmemory Compaction, we commit it
>> > defaulted 'on' ("BASIC" mode), which is what we want in hbase2. Should it
>> > prove problematic under test, we disable it before release.
>> >
>> > Are folks good w/ this mode? I ask because, in a few issues, there are
>> > requests for proof that a master feature is 'stable' before commit. This
>> > is normally a healthy request, but in master's case it is hard to
>> > demonstrate stability given its current state.
>> >
>> > Other outstanding issues, such as decisions about whether master hosts
>> > system tables only (by default), I'm thinking we can work out post-branch
>> > in alpha/betas before release.
>> >
>> > The awkward item is the long-pole Assignment Manager. This is an
>> > all-or-nothing affair. Here we are switching in a new Master core. While I
>> > think it fine that AMv2 is incomplete come branch time, those of us working
>> > on the new AM still need to demonstrate to you all that it is basically
>> > viable.
>> >
>> > The point-of-no-return is commit of the patch in HBASE-14614. HBASE-14614
>> > (AMv2) is coming close to passing all unit tests. We'll spend some time
>> > running it on a cluster to make sure it is fundamentally sound and will
>> > report back on our experience. There has been an ask for some dev doc and
>> > low-levels on how it works (in progress). Let satisfaction of these
>> > requests be blockers on commit. We'll put the HBASE-14614 commit up for a
>> > vote on dev list given its import.
>> >
>> > Branch will happen after HBASE-14614 goes in (or its rejection) with our
>> > first alpha soon after. It's looking like a week or two at least given how
>> > things have been going up to this.
>> >
>> > I intend to start in on hbase2 stability/perf projects after we branch.
>> >
>> > Interested in any thoughts you all might have on the above (Would also
>> > appreciate updates on state in [1] if you are a feature owner).
>> >
>> > Thanks,
>> > St.Ack
>> >
>> > 1. https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/edit#
>> >
>> >
>> >
>> > On Sat, Mar 11, 2017 at 5:32 PM, Josh Elser <elserj@apache.org> wrote:
>> >
>> > >
>> > > Stack wrote:
>> > >
>> > >> On Tue, Mar 7, 2017 at 1:51 PM, Josh Elser <elserj@apache.org> wrote:
>> > >>
>> > >>> Thanks for pulling in the FS Quotas work, Stack. I'm trying to cross the
>> > >>> last T's and dot the last I's.
>> > >>>
>> > >>> The biggest thing I know I need to do still is to write a new chapter to
>> > >>> the book. After that, I'd start entertaining larger reviews/discussions
>> > >>> to merge the feature into master. Anyone with free time (giggles) is
>> > >>> more than welcome to start perusing :)
>> > >>>
>> > >>>
>> > >> Out of interest, this could come in after 2.0, Josh? Any 2.0-specific
>> > >> needs to make this work?
>> > >>
>> > >> Meantime, updated the 2.0 doc [1].
>> > >>
>> > >> Thanks Josh,
>> > >> St.Ack
>> > >>
>> > >> 1. https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/edit#
>> > >>
>> > >>
>> > > Nope, no need to block 2.0 on this one (given the other, related
>> > > chatter).
>> > > Would be nice to get it in, but I completely understand if it slips :)
>> > >
>> > > Thanks for updating the doc for me!
>> > >
>> >
>>
>>
>>
>> --
>> Best regards,
>>
>>    - Andy
>>
>> If you are given a choice, you believe you have acted freely. - Raymond
>> Teller (via Peter Watts)
>>
>
>
