hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: VOTE: Merge HBASE-18070 "Enable memstore replication for meta replica" to master and then back to branch-2" (Was "HEAD-UP: Merging HBASE-18070 "Enable memstore replication for meta replica" to master and then back to branch-2")
Date Thu, 19 Nov 2020 05:42:56 GMT
On Wed, Nov 18, 2020 at 7:03 PM 张铎(Duo Zhang) <palomino219@gmail.com> wrote:

> OK, let me explain the technical part.
>
> What I proposed in the test is to verify that we could distribute the load
> across all the meta replicas so we could benefit if the main replica is
> f**ked up. But then stack said this has already been solved by the old read
> replicas feature. Maybe in the first place I did not speak clearly enough,
> but later I stated clearly that I was talking about the distribution of the
> load for the meta table, but stack still does not agree and insists that I
> was talking about hedged reads.
>
> For me, I do not think hedged reads can fully solve the 'primary region
> f**ked up' problem. Of course we will go to secondary replicas if the
> primary cannot respond, but that usually means the primary replica is not in
> a good state. The region servers in a cluster will not go to the secondary
> replicas to read, right? If the primary replica is unavailable, a failure of
> a meta read could crash a region server. And it could also affect write
> requests to meta, which could cause serious problems on master too. I've
> implemented a lot of procedures on 2.x; usually we will just abort master
> if there is a failure when accessing meta. This means that, in the old
> hedged read mode, if the primary replica has been f**ked up, the cluster
> will not be in a good state and the test will eventually fail.
>
> And I think HBASE-18070 can solve the problem. But the main developer seems
> to have a different opinion on this. So I asked him for his opinion on
> the 4 questions on jira, but so far I have not gotten a response from him.
>
> The reason I did not want to write the above explanation earlier is that, if
> I throw this out, the main developer could easily say 'yes I agree with
> you, this is my point', simply to let the vote pass. But the
> actual issue would be covered up, as he never states his own opinion, and
> this may cause trouble in the future.
>
>
The veto seems to pivot on whether I, a co-author, know what the feature I
co-designed and co-wrote does. He has posed a quiz for me to fill out that
I am to answer to his satisfaction even though my co-author has already
answered his questionnaire.

I suggest that the vote be on the feature rather than my responses to a
questionnaire of Duo's making.

S



> Thanks.
>
> Andrew Purtell <apurtell@apache.org> wrote on Thu, Nov 19, 2020 at 10:23 AM:
>
> > That's not how a technical veto works. The burden to explain how the
> > contributors can fix the reason for the veto is on you. You need to give a
> > list of action items. "Fundamental of the issue" is just your opinion.
> > Nobody here is a Boss. Contributors don't have to satisfy your (nebulous)
> > requirements, you have to successfully argue your point.
> >
> > On Wed, Nov 18, 2020 at 6:10 PM 张铎(Duo Zhang) <palomino219@gmail.com>
> > wrote:
> >
> > > Thank you Andrew. I think my last comment clearly addresses the two
> > > questions given by you.
> > >
> > > > A clear and compelling reason why the proposed change is harmful or
> > > > undesirable
> > >
> > > It is about the fundamental of this issue. Due to the back and forth on how
> > > a test could be used to verify the feature, I'm concerned whether the main
> > > developer has the same opinion on the problems we want to solve for this
> > > issue. This is a very critical problem: if we cannot even reach an
> > > agreement on what to solve, I do not think we should allow the merge of the
> > > branch.
> > >
> > > > One or more clear and specific action items which would allow the
> > > > contributors to cure the reason for the veto
> > >
> > > This was also very clear even before we started this vote thread. I
> > > asked 4 technical questions and waited for an answer, but it seems the main
> > > developer refused to answer the questions and told me to read the design doc
> > > of all the related issues. The design doc was not all written by him, so I do
> > > not think this is a constructive suggestion to solve the concerns here.
> > >
> > > Thanks.
> > >
> > > Sean Busbey <busbey@apache.org> wrote on Thu, Nov 19, 2020 at 4:27 AM:
> > >
> > > > Pause a moment Huaxiang and give some time for the PMC to talk in
> > > > private a bit.
> > > >
> > > > On Wed, Nov 18, 2020 at 12:44 PM Huaxiang Sun <huaxiangsun@gmail.com>
> > > > wrote:
> > > > >
> > > > > This vote has passed its 24-hour deadline. We got 5 +1s and 1 -1. What is
> > > > > the path to move forward? Is there anything we (as feature developers) can
> > > > > do to revert the -1?
> > > > > As it blocks the 2.4 release, I think we need a decision asap.
> > > > >
> > > > > Thanks,
> > > > > Huaxiang
> > > > >
> > > > > On Wed, Nov 18, 2020 at 8:46 AM Andrew Purtell <andrew.purtell@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Let me refer you to the Foundation guidance on voting:
> > > > > > https://www.apache.org/foundation/voting.html , and specifically the
> > > > > > section on vetos:
> > > > > >
> > > > > > A code-modification proposal may be stopped dead in its tracks by a -1 vote
> > > > > > by a qualified voter. This constitutes a veto, and it cannot be overruled
> > > > > > nor overridden by anyone. Vetos stand until and unless withdrawn by their
> > > > > > casters. To prevent vetos from being used capriciously, they must be
> > > > > > accompanied by a technical justification showing why the change is bad
> > > > > > (opens a security exposure, negatively affects performance, *etc.*). A
> > > > > > veto without a justification is invalid and has no weight.
> > > > > > The Merriam-Webster dictionary defines 'capricious' as a sudden,
> > > > > > unpredictable, and impulsive act
> > > > > > <https://www.merriam-webster.com/dictionary/caprice>. To guard against this
> > > > > > kind of chaos in voting on technical matters, a technical veto must have a
> > > > > > clear and compelling reason. Neither on the earlier thread nor the JIRA is
> > > > > > a clear and compelling concern about the to-be-merged feature, clearly
> > > > > > communicated. A technical veto must also be accompanied with clear and
> > > > > > actionable feedback for the contributors, which in my view is also absent.
> > > > > > A veto because one participant in the discussion does not understand the
> > > > > > change or its motivation, or simply expresses an opinion that it is not
> > > > > > ideal and/or needed, is not a valid reason for a technical veto and
> > > > > > certainly does not provide actionable guidance for curing the veto. The
> > > > > > burden of the technical veto is not on the contributors to convince the
> > > > > > vetoing voter; the burden of proof is on the vetoing voter.
> > > > > >
> > > > > > In my view, as things stand the veto here is not yet valid but can be made
> > > > > > valid by offering the following:
> > > > > >
> > > > > >    - A clear and compelling reason why the proposed change is harmful or
> > > > > >    undesirable
> > > > > >    - One or more clear and specific action items which would allow the
> > > > > >    contributors to cure the reason for the veto
> > > > > >
> > > > > > Otherwise, the veto should be given no weight.
> > > > > >
> > > > > > To explain further my reason for concern, I have reviewed the discussion
> > > > > > thread and JIRA in question here and the reason given for veto seems to me
> > > > > > a relatively minor technical matter that can easily be cured, to the extent
> > > > > > it has been described (the reason is somewhat unclear), with a simple and
> > > > > > straightforward follow up. There is no blocking functional, performance,
> > > > > > regression, or security related reason. However we have a repeat of a
> > > > > > pattern of disagreement related to a personal problem between two
> > > > > > participants in the discussion, including the vetoing voter.
> > > > > >
> > > > > >
> > > > > > On Tue, Nov 17, 2020 at 8:03 PM Andrew Purtell <andrew.purtell@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I am concerned this is not a valid technical veto and it’s time for the
> > > > > > > PMC to take a more active role. This is poison to collaboration and it is
> > > > > > > affecting multiple people.
> > > > > > >
> > > > > > > > On Nov 17, 2020, at 5:43 PM, 张铎 <palomino219@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi, bringing over my -1 from the HEAD-UP thread; this is a veto.
> > > > > > > > >
> > > > > > > > > My concerns have not been fully resolved. Let's work it out on jira.
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > > >
> > > > > > > > > clara xiong <clarax98007@gmail.com> wrote on Wed, Nov 18, 2020 at 1:51 AM:
> > > > > > > >
> > > > > > > >> +1
> > > > > > > >>
> > > > > > > > >>> On Tue, Nov 17, 2020 at 9:49 AM Huaxiang Sun <huaxiangsun@gmail.com>
> > > > > > > > >>> wrote:
> > > > > > > >>>
> > > > > > > >>> +1
> > > > > > > >>>
> > > > > > > > >>> On Tue, Nov 17, 2020 at 9:21 AM Bharath Vissapragada <bharathv@apache.org>
> > > > > > > > >>> wrote:
> > > > > > > >>>
> > > > > > > > >>>> +1. Reviewed the design doc and the consolidated patch, great improvement,
> > > > > > > > >>>> thanks for putting this together.
> > > > > > > >>>>
> > > > > > > > >>>> On Tue, Nov 17, 2020 at 9:09 AM Stack <stack@duboce.net> wrote:
> > > > > > > >>>>
> > > > > > > >>>>> +1
> > > > > > > >>>>> S
> > > > > > > >>>>>
> > > > > > > > >>>>> On Tue, Nov 17, 2020 at 8:43 AM Stack <stack@duboce.net> wrote:
> > > > > > > >>>>>
> > > > > > > > >>>>>> Please VOTE on whether to merge HBASE-18070 feature branch to master (and
> > > > > > > > >>>>>> HBASE-18070.branch-2 to branch-2). The VOTE runs for 24 hours. The majority
> > > > > > > > >>>>>> prevails (+ or -).
> > > > > > > >>>>>>
> > > > > > > >>>>>> Quoting the design lead-in:
> > > > > > > >>>>>>
> > > > > > > > >>>>>> Read Replicas on the hbase:meta Table currently only does primitive read
> > > > > > > > >>>>>> of the primary’s hfiles refreshing every (configurable) N seconds. This
> > > > > > > > >>>>>> issue is about making it so we can do the Async WAL Replication
> > > > > > > > >>>>>> <http://hbase.apache.org/book.html#_asnyc_wal_replication> ability,
> > > > > > > > >>>>>> currently only available for user-space Tables, against the hbase:meta
> > > > > > > > >>>>>> system Tables too; i.e. the primary replica pushes edits to its Replicas so
> > > > > > > > >>>>>> they run much closer to the primaries’ state. If clients could be satisfied
> > > > > > > > >>>>>> reading from Replicas, then we could have improved hbase:meta uptimes but
> > > > > > > > >>>>>> also, we can distribute load off of the primary and alleviate hbase:meta
> > > > > > > > >>>>>> Table (read) hotspotting.
> > > > > > > >>>>>>
> > > > > > > > >>>>>> Each PR that comprises the feature branch has been reviewed before commit.
> > > > > > > >>>>>>
> > > > > > > > >>>>>> * For the design, see [2].
> > > > > > > > >>>>>> * For an amalgamated PR of the 5 or 6 reviewed PRs that comprise this
> > > > > > > > >>>>>> feature, see [3].
> > > > > > > > >>>>>> * For a PE report that compared performance before and after, see
> > > > > > > > >>>>>> HBASE-25127 (no regression).
> > > > > > > > >>>>>> * A report on ITBLL runs is pending to be attached to HBASE-18070 but
> > > > > > > > >>>>>> runs so far show no regression with the feature enabled (ITBLL runs were
> > > > > > > > >>>>>> done against a backport of this feature to branch-2 as the ITBLL state of
> > > > > > > > >>>>>> master is currently an unknown).
> > > > > > > >>>>>>
> > > > > > > > >>>>>> Testing continues mainly looking for further improvement and to better
> > > > > > > > >>>>>> understand this feature in operation. Documentation is included. There are
> > > > > > > > >>>>>> some follow-ons that have been identified but these can land later.
> > > > > > > > >>>>>> Thanks and thanks to all who contributed to this feature; the reviewers
> > > > > > > > >>>>>> and the testers in particular.
> > > > > > > >>>>>>
> > > > > > > >>>>>> S
> > > > > > > >>>>>>
> > > > > > > >>>>>> 1.
> > http://hbase.apache.org/book.html#_asnyc_wal_replication
> > > > > > > >>>>>> 2.
> > > > > > > >>>>>>
> > > > > > > >>>>>
> > > > > > > >>>>
> > > > > > > >>>
> > > > > > > >>
> > > > > > >
> > > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1jJWVc-idHhhgL4KDRpjMsQJKCl_NRaCLGiH3Wqwd3O8/edit#
> > > > > > > >>>>>> This patch is currently missing
HBASE-25280, a bug found
> > in
> > > > > > > >> testing.
> > > > > > > >>>>>> 3. https://github.com/apache/hbase/pull/2643
> > > > > > > >>>>>>
> > > > > > > >>>>>
> > > > > > > >>>>
> > > > > > > >>>
> > > > > > > >>
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best regards,
> > > > > > Andrew
> > > > > >
> > > > > > Words like orphans lost among the crosstalk, meaning torn from truth's
> > > > > > decrepit hands
> > > > > >    - A23, Crosstalk
> > > > > >
> > > >
> > >
> >
> >
> > --
> > Best regards,
> > Andrew
> >
> > Words like orphans lost among the crosstalk, meaning torn from truth's
> > decrepit hands
> >    - A23, Crosstalk
> >
>
