geode-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Juan José Ramos <jra...@pivotal.io>
Subject Re: Proposal to Include GEODE-7079 in 1.10.0
Date Mon, 19 Aug 2019 14:45:18 GMT
Thanks Owen!!.

On Mon, Aug 19, 2019 at 3:44 PM Owen Nichols <onichols@pivotal.io> wrote:

> There appears to be consensus that this is a critical fix.
>
> The following commit has been brought into release/1.10.0 <
> https://github.com/apache/geode/tree/release/1.10.0> as the critical fix
> for GEODE-7079 <https://issues.apache.org/jira/browse/GEODE-7079>:
>
> git cherry-pick -x 6f4bbbd96bcecdb82cf7753ce1dae9fa6baebf9b <
> https://github.com/apache/geode/commit/6f4bbbd96bcecdb82cf7753ce1dae9fa6baebf9b
> >
>
> GEODE-7079 <https://issues.apache.org/jira/browse/GEODE-7079> has been
> changed to 'resolved in' 1.10.0.
>
> -Owen
>
> > On Aug 19, 2019, at 2:39 AM, Juan José Ramos <jramos@pivotal.io> wrote:
> >
> > Hello team,
> >
> > Just following up on my proposal to include GEODE-7079 in release
> 1.10.0, I
> > think there's enough consensus at this point (+1: 7, +0: 1), can we
> cherry
> > pick commit *6f4bbbd [1]* into the release branch?.
> > Best regards.
> >
> >
> > [1]:
> >
> https://github.com/apache/geode/commit/6f4bbbd96bcecdb82cf7753ce1dae9fa6baebf9b
> >
> > On Thu, Aug 15, 2019 at 9:17 PM Udo Kohlmeyer <udo@apache.com> wrote:
> >
> >> I'm changing my vote to +1 on this issue.
> >>
> >> The ONLY reason I'm changing my vote is to add to the cleanliness of the
> >> code of the release. I do 100% disagree with the continual scope creep
> >> that we have been incurring on this release branch.
> >>
> >> --Udo
> >>
> >> On 8/15/19 12:34 PM, Dan Smith wrote:
> >>> +1 to merging Juan's fix for GEODE-7079. I've seen systems taken down
> by
> >>> rapidly filling up the logs in the past, this does seem to be a
> critical
> >>> fix from the perspective of system stability.
> >>>
> >>> Also, this is the change, which doesn't seem particularly risky to me.
> >>>
> >>> -          ConflationKey key = new
> >>> ConflationKey(gsEvent.getRegion().getFullPath(),
> >>> +          ConflationKey key = new
> ConflationKey(gsEvent.getRegionPath(),
> >>>
> >>> -Dan
> >>>
> >>> On Thu, Aug 15, 2019 at 12:23 PM Udo Kohlmeyer <udo@apache.com> wrote:
> >>>
> >>>> Whilst I agree with "*finish* when we believe the quality of the
> release
> >>>> branch is sufficient", I disagree that we should have cut a branch and
> >>>> continue to patch that branch with non-critical fixes. i.e this issue
> >>>> has been around for a while and has no averse side effects. Issues
> like
> >>>> GEODE-7081, which is new due to a new commit, AND it has critical
> >>>> stability implications on the server, that I can agree we should
> include
> >>>> in a potential release branch.
> >>>>
> >>>> Otherwise we can ALWAYS argue that said release branch is not of
> >>>> "sufficient" quality, especially if there are numerous existing JIRA's
> >>>> pertaining to bugs already in the system.
> >>>>
> >>>> To quote Juan's original email:
> >>>>
> >>>> /"Note: *no events are lost (even without the fix)* but, if the region
> >>>> takes//
> >>>> //a while to recover, the logs  for the member can grow pretty quickly
> >>>> due to//
> >>>> //the continuously thrown *NPEs.*"/
> >>>>
> >>>> In addition to this, if there is a commit in a cut release branch,
> which
> >>>> is requiring us to continuously patching the release branch, in order
> to
> >>>> stabilize that feature/fix, maybe we should consider reverting that
> fix
> >>>> and release it at a later stage, when it is believed that this fix is
> >>>> more stable and have better, more comprehensive test coverage.
> >>>>
> >>>> So far, GEODE-7081, does not have me convinced that it is critical.
OR
> >>>> maybe it is the latter of my options, where it is a stabilization
> commit
> >>>> to a new feature, which begs the question, should we have accepted the
> >>>> original feature commit if there are all manner of side effects which
> we
> >>>> are only discovering.
> >>>>
> >>>> --Udo
> >>>>
> >>>> On 8/15/19 11:08 AM, Anthony Baker wrote:
> >>>>> While we can’t fix *all known bugs*, I think where we do have
a fix
> for
> >>>> an important issue we should think hard about the cost of not
> including
> >>>> that in a release.
> >>>>> IMO, the fixed time approach to releases means that we *start* the
> >>>> release effort (including stabilization and bug fixing if needed) on
a
> >>>> known date and we *finish* when new believe the quality of the release
> >>>> branch is sufficient.  Given the number of important fixes being
> >> requested,
> >>>> I’m not sure we are there yet.
> >>>>> I think the release branch concept has merit because it allows us
to
> >>>> isolate ongoing work from the changes needed for a release.
> >>>>> +1 for including GEODE-7079.
> >>>>>
> >>>>> Anthony
> >>>>>
> >>>>>
> >>>>>> On Aug 15, 2019, at 10:51 AM, Udo Kohlmeyer <ukohlmeyer@gmail.com>
> >>>> wrote:
> >>>>>> Seems everyone is in favor or including a /*non-critical*/ fix
to an
> >>>> already cut branch of the a potential release...
> >>>>>> Am I missing something?
> >>>>>>
> >>>>>> Why cut a release at all... just have a perpetual cycle of fixes
> added
> >>>> to develop and users can chose what nightly snapshot build they would
> >> want
> >>>> to use..
> >>>>>> I'm voting -1 on a non-critical issue, which is existing and
worst
> >>>> effect is to fill logs will NPE logs... (yes, not something we want).
> >>>>>> I believed that we (as a Geode community) agreed that once a
release
> >>>> has been cut, only critical issue fixes will be included. If we
> continue
> >>>> just continually adding to the ALREADY CUT 1.10 release, where do we
> >> stop
> >>>> and when do we release...
> >>>>>> --Udo
> >>>>>>
> >>>>>> On 8/15/19 10:19 AM, Nabarun Nag wrote:
> >>>>>>> +1
> >>>>>>>
> >>>>>>> On Thu, Aug 15, 2019 at 10:15 AM Alexander Murmann <
> >>>> amurmann@apache.org>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> +1
> >>>>>>>>
> >>>>>>>> Agreed to fixing this. It's impossible for a user to
discover they
> >>>> hit an
> >>>>>>>> edge case that we fail to support till they are in prod
and
> restart.
> >>>>>>>>
> >>>>>>>> On Thu, Aug 15, 2019 at 10:09 AM Juan José Ramos <
> jramos@pivotal.io
> >>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hello Udo,
> >>>>>>>>>
> >>>>>>>>> Even if it is an existing issue I'd still consider
it critical
> for
> >>>> those
> >>>>>>>>> cases on which there are unprocessed events on the
persistent
> queue
> >>>>>>>> after a
> >>>>>>>>> restart and the region takes long to recover...
you can actually
> >> see
> >>>>>>>>> millions of *NPEs* flooding the member's logs.
> >>>>>>>>> My two cents anyway, it's up to the community to
make the final
> >>>> decision.
> >>>>>>>>> Cheers.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Thu, Aug 15, 2019 at 5:58 PM Udo Kohlmeyer <udo@apache.com>
> >>>> wrote:
> >>>>>>>>>> Juan,
> >>>>>>>>>>
> >>>>>>>>>>   From your explanation, it seems this issue
is existing and not
> >>>>>>>>>> critical. Could we possibly hold this for 1.11?
> >>>>>>>>>>
> >>>>>>>>>> --Udo
> >>>>>>>>>>
> >>>>>>>>>> On 8/15/19 5:29 AM, Ju@N wrote:
> >>>>>>>>>>> Hello team,
> >>>>>>>>>>>
> >>>>>>>>>>> I'd like to propose including the *fix [1]*
for *GEODE-7079
> [2]*
> >> in
> >>>>>>>>>> release
> >>>>>>>>>>> 1.10.0.
> >>>>>>>>>>> Long story short: a *NullPointerException*
can be continuously
> >>>> thrown
> >>>>>>>>>>> and flood the member's logs if a serial
event processor (either
> >>>>>>>>>>> *async-event-queue* or *gateway-sender*)
starts processing
> events
> >>>>>>>> from
> >>>>>>>>> a
> >>>>>>>>>>> recovered persistent queue before the actual
region to which it
> >> was
> >>>>>>>>>>> attached is fully operational.
> >>>>>>>>>>> Note: *no events are lost (even without
the fix)* but, if the
> >>>> region
> >>>>>>>>>> takes
> >>>>>>>>>>> a while to recover, the logs  for the member
can grow pretty
> >>>> quickly
> >>>>>>>>> due
> >>>>>>>>>> to
> >>>>>>>>>>> the continuously thrown *NPEs.*
> >>>>>>>>>>> Best regards.
> >>>>>>>>>>>
> >>>>>>>>>>> [1]:
> >>>>>>>>>>>
> >>>>
> >>
> https://github.com/apache/geode/commit/6f4bbbd96bcecdb82cf7753ce1dae9fa6baebf9b
> >>>>>>>>>>> [2]: https://issues.apache.org/jira/browse/GEODE-7079
> >>>>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Juan José Ramos Cassella
> >>>>>>>>> Senior Software Engineer
> >>>>>>>>> Email: jramos@pivotal.io
> >>>>>>>>>
> >>
> >
> >
> > --
> > Juan José Ramos Cassella
> > Senior Software Engineer
> > Email: jramos@pivotal.io
>
>

-- 
Juan José Ramos Cassella
Senior Software Engineer
Email: jramos@pivotal.io

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message