hbase-dev mailing list archives

From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: [DISCUSS] Make AsyncFSWAL the default WAL in 2.0
Date Thu, 12 May 2016 15:47:12 GMT
On Wed, May 11, 2016 at 10:28 PM, Andrew Purtell <andrew.purtell@gmail.com> wrote:

> All you have to do is stick around long enough. Hadoop 0.20-append v2 :-)
>

*palm-all-the-faces*

> On May 11, 2016, at 9:46 PM, Stack <stack@duboce.net> wrote:
> >
> >> On Wed, May 11, 2016 at 7:53 PM, 张铎 <palomino219@gmail.com> wrote:
> >>
> >> I think at that time I will start a new project called AsyncDFSClient which
> >> will implement the whole client side logic of HDFS without using reflection
> >> :)
> > Haven't I seen this movie before? (smile)
> > St.Ack
> >
> >
> >
> >> 2016-05-12 10:27 GMT+08:00 Andrew Purtell <andrew.purtell@gmail.com>:
> >>
> >>> If Hadoop refuses the changes before we release, we can change the default
> >>> back.
> >>>
> >>>
> >>> On May 11, 2016, at 6:50 PM, Gary Helmling <ghelmling@gmail.com> wrote:
> >>>
> >>>>>
> >>>>>
> >>>>> I was trying to avoid the below oft-repeated pattern at least for the case
> >>>>> of critical developments:
> >>>>>
> >>>>> + New feature arrives after much work by developer, reviewers and testers
> >>>>> accompanied by fanfare (blog, talks).
> >>>>> + Developers and reviewers move on after getting it committed or it gets
> >>>>> hacked into a deploy so it works in a frankenstein form
> >>>>> + It sits in our code base across one or more releases marked as optional,
> >>>>> 'experimental'
> >>>>> + The 'experimental' blemish discourages its exercise by users
> >>>>> + The feature lags, rots
> >>>>> + Or, the odd time, we go ahead and enable it as default in spite of the
> >>>>> fact it was never tried when experimental.
> >>>>>
> >>>>> Distributed Log Replay sat in hbase across a few major versions. Only when
> >>>>> the threat of our making an actual release with it on by default did it get
> >>>>> serious attention where it was found flawed and is now being actively
> >>>>> purged. This was after it made it past reviews, multiple attempts at
> >>>>> testing at scale, and so on; i.e. we'd done it all by the book. The time in
> >>>>> an 'experimental' state added nothing.
> >>>> Those are all valid concerns as well. It's certainly a pattern that we've
> >>>> seen repeated. That's also a broader concern I have about the farther we
> >>>> push out 2.0, then the less exercised master is.
> >>>>
> >>>> I don't really know how best to balance this with concerns about user
> >>>> stability.  Enabling by default in master would certainly be a forcing
> >>>> function and would help it get more testing before release.  I hear that
> >>>> argument.  But I'm worried about the impact after release, where something
> >>>> as simple as a bug-fix point release upgrade of Hadoop could result in
> >>>> runtime breakage of an HBase install.  Will this happen in practice?  I
> >>>> don't know.  It seems unlikely that the private variable names being used
> >>>> for example would change in a point release.  But we're violating the
> >>>> abstraction that Hadoop provides us which guarantees such breakage won't
> >>>> occur.
> >>>>
> >>>>
> >>>>>> Yes. 2.0 is a bit out there so we have some time to iron out issues is the
> >>>>> thought. Yes, it could push out delivery of 2.0.
> >>>> Having this on by default in an unreleased master doesn't actually worry me
> >>>> that much.  It's just the question of what happens when we do release.  At
> >>>> that point, this discussion will be ancient history and I don't think we'll
> >>>> give any renewed consideration to what the impact of this change might be.
> >>>> Ideally it would be great to see this work in HDFS by that point and for
> >>>> that HDFS version this becomes a non-issue.
> >>>>
> >>>>
> >>>>>
> >>>>> I think the discussion here has been helpful. Holes have been found (and
> >>>>> plugged), the risk involved has gotten a good airing out here on dev, and
> >>>>> in spite of the back and forth, one of our experts in good standing is
> >>>>> still against it being on by default.
> >>>>>
> >>>>> If you are not down w/ the arguments, I'd be fine not making it the
> >>>>> default.
> >>>>> St.Ack
> >>>>
> >>>> I don't think it's right to block this by myself, since I'm clearly in the
> >>>> minority.  Since others clearly support this change, have at it.
> >>>>
> >>>> But let me pose an alternate question: what if HDFS flat out refuses to
> >>>> adopt this change?  What are our options then with this already shipping as
> >>>> a default?  Would we continue to endure breakage due to the use of HDFS
> >>>> private internals?  Do we switch the default back?  Do we do something else?
> >>>>
> >>>> Thanks for the discussion.
> >>
>
