aurora-dev mailing list archives

From John Sirois <jsir...@apache.org>
Subject Re: 0.12.0 RC status
Date Fri, 05 Feb 2016 18:42:55 GMT
On Fri, Feb 5, 2016 at 10:23 AM, Maxim Khutornenko <maxim@apache.org> wrote:

> Just to set expectations straight on my side, I won't be able to spend
> time on this until next week. I am planning to do a live cluster restore to
> better understand and document all findings.
>
> To reiterate what I mentioned above, I don't think this should be a
> blocker for the release. The restore-from-backup is a very environment
> sensitive procedure and our instructions should be treated as general
> guidance rather than a precise set of steps to follow.
>

OK - thanks for the feedback.
I am not comfortable with that status for the docs personally, but if
that's the status quo I see no reason to block 0.12.0 on the docs and will
proceed to prep rc3 ~presently.

I do think, though, that we should block 0.13.0 on docs that can actually be
used.  My findings did not indicate a simple "different environments will be
different" issue, but a much more fundamental one!
My reasoning for a 0.13.0 block is that one of Aurora's selling points /
differentiating points has, it seems to me, been its robustness for production
operations, and useable recovery procedures or tools (or both) seem important
to provide in support of that selling point.


>
> On Fri, Feb 5, 2016 at 8:10 AM, John Sirois <jsirois@apache.org> wrote:
>
>> On Wed, Feb 3, 2016 at 10:58 AM, John Sirois <jsirois@apache.org> wrote:
>>
>> >
>> >
>> > On Tue, Feb 2, 2016 at 10:22 AM, Maxim Khutornenko <maxim@apache.org> wrote:
>> >
>> >> +1 to having 1603 and 1601 as blockers. I am planning to work on 1603
>> >> today.
>> >>
>> >> As for 1605, I don't believe it's a blocker given that all findings are
>> >> already documented in the ticket.
>> >>
>> >
>> > I went through a recovery using the guide and hit issues that don't
>> > square with the corrections described in AURORA-1605 nor with the new
>> > `--bypass-leader-redirect` capability introduced to aurora_admin in
>> > AURORA-1601.
>> > I suspect this can be explained by me not knowing what I'm doing!  That
>> > said, unless I'm being especially dumb here, neither will the 1st-time
>> > restorer.
>> >
>> > I'll wait for you to close out AURORA-1603 to signal an OK on the
>> > technical issue that necessitated the restore in the 1st place, and I'd
>> > like to block on some feedback on my restore experience, documented in
>> > AURORA-1605, before making up my mind on AURORA-1605 being a release
>> > blocker.  It does seem to me we should have useable restore docs as a
>> > high priority, but if they've been broken in large ways for some time, I
>> > might be convinced that AURORA-1605 is a valid 0.13.0 release blocker but
>> > not 0.12.0.
>> >
>>
>> Alright - Maxim has closed out AURORA-1603 and only AURORA-1605 remains.
>> I'd still like to block on that if someone can devote some time in the
>> next 2 business days to running through the docs and correcting / reviewing
>> the issues I had with the docs as noted in the issue.
>> If I have no feedback on the status of AURORA-1605 by the morning (MST) of
>> Monday February 8th, I'll take that as a silent disapproval of the block and
>> proceed to cut 0.12.0-rc3.
>>
>>
>> >
>> >> On Tue, Feb 2, 2016 at 7:03 AM, Joshua Cohen <jcohen@apache.org> wrote:
>> >>
>> >> > I'd only consider item 1 to be a blocker to 0.12.0, but 2 and 3
>> >> > should be relatively quick so in general this sounds like a
>> >> > reasonable plan of action to me.
>> >> >
>> >> > On Tue, Feb 2, 2016 at 8:52 AM, John Sirois <jsirois@apache.org> wrote:
>> >> >
>> >> > > Although the last blocker raised for the 0.12.0 RC series has been
>> >> > > resolved [1], it looks like resolution of several issues related to
>> >> > > rolling back to 0.11.0 is required to cut the next RC:
>> >> > > 1. "Scheduler fails to start after rollback":
>> >> > > https://issues.apache.org/jira/browse/AURORA-1603
>> >> > > 2. "Add a flag to disable the HTTP redirect to the leader":
>> >> > > https://issues.apache.org/jira/browse/AURORA-1601
>> >> > > 3. "Update recovery docs to reflect changes":
>> >> > > https://issues.apache.org/jira/browse/AURORA-1605
>> >> > >
>> >> > > These issues fall into 2 classes:
>> >> > > Item 1 above needs to fix the immediate problem of rolling back to
>> >> > > 0.11.0, although there may be more changes to process, tooling and
>> >> > > code to handle the problem better going forward.
>> >> > > Items 2 & 3 address tooling & procedure that support rollback.
>> >> > >
>> >> > > It looks like Maxim has claimed item 1/AURORA-1603 and Joshua is
>> >> > > working item 2/AURORA-1601.  I assume one of Maxim, Joshua or Zameer
>> >> > > will tackle item 3/AURORA-1605 to update rollback docs with what they
>> >> > > learned rolling back.
>> >> > >
>> >> > > If I have any of this wrong, please speak up; otherwise I'll be
>> >> > > cutting the next 0.12.0 RC3 when the above 3 issues are resolved.
>> >> > >
>> >> > > [1] "Identity.role is still used in the UI leading to duplicate
>> >> > > instances on job page":
>> >> > > https://issues.apache.org/jira/browse/AURORA-1604
>> >> > >
>> >> >
>> >>
>> >
>> >
>>
>
>
