geode-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Udo Kohlmeyer <>
Subject Re: Proposal to Include GEODE-7079 in 1.10.0
Date Thu, 15 Aug 2019 20:17:53 GMT
I'm changing my vote to +1 on this issue.

The ONLY reason I'm changing my vote is to add to the cleanliness of the 
code of the release. I do 100% disagree with the continual scope creep 
that we have been incurring on this release branch.


On 8/15/19 12:34 PM, Dan Smith wrote:
> +1 to merging Juan's fix for GEODE-7079. I've seen systems taken down by
> rapidly filling up the logs in the past, this does seem to be a critical
> fix from the perspective of system stability.
> Also, this is the change, which doesn't seem particularly risky to me.
> -          ConflationKey key = new
> ConflationKey(gsEvent.getRegion().getFullPath(),
> +          ConflationKey key = new ConflationKey(gsEvent.getRegionPath(),
> -Dan
> On Thu, Aug 15, 2019 at 12:23 PM Udo Kohlmeyer <> wrote:
>> Whilst I agree with "*finish* when we believe the quality of the release
>> branch is sufficient", I disagree that we should have cut a branch and
>> continue to patch that branch with non-critical fixes. i.e this issue
>> has been around for a while and has no averse side effects. Issues like
>> GEODE-7081, which is new due to a new commit, AND it has critical
>> stability implications on the server, that I can agree we should include
>> in a potential release branch.
>> Otherwise we can ALWAYS argue that said release branch is not of
>> "sufficient" quality, especially if there are numerous existing JIRA's
>> pertaining to bugs already in the system.
>> To quote Juan's original email:
>> /"Note: *no events are lost (even without the fix)* but, if the region
>> takes//
>> //a while to recover, the logs  for the member can grow pretty quickly
>> due to//
>> //the continuously thrown *NPEs.*"/
>> In addition to this, if there is a commit in a cut release branch, which
>> is requiring us to continuously patching the release branch, in order to
>> stabilize that feature/fix, maybe we should consider reverting that fix
>> and release it at a later stage, when it is believed that this fix is
>> more stable and have better, more comprehensive test coverage.
>> So far, GEODE-7081, does not have me convinced that it is critical. OR
>> maybe it is the latter of my options, where it is a stabilization commit
>> to a new feature, which begs the question, should we have accepted the
>> original feature commit if there are all manner of side effects which we
>> are only discovering.
>> --Udo
>> On 8/15/19 11:08 AM, Anthony Baker wrote:
>>> While we can’t fix *all known bugs*, I think where we do have a fix for
>> an important issue we should think hard about the cost of not including
>> that in a release.
>>> IMO, the fixed time approach to releases means that we *start* the
>> release effort (including stabilization and bug fixing if needed) on a
>> known date and we *finish* when new believe the quality of the release
>> branch is sufficient.  Given the number of important fixes being requested,
>> I’m not sure we are there yet.
>>> I think the release branch concept has merit because it allows us to
>> isolate ongoing work from the changes needed for a release.
>>> +1 for including GEODE-7079.
>>> Anthony
>>>> On Aug 15, 2019, at 10:51 AM, Udo Kohlmeyer <>
>> wrote:
>>>> Seems everyone is in favor or including a /*non-critical*/ fix to an
>> already cut branch of the a potential release...
>>>> Am I missing something?
>>>> Why cut a release at all... just have a perpetual cycle of fixes added
>> to develop and users can chose what nightly snapshot build they would want
>> to use..
>>>> I'm voting -1 on a non-critical issue, which is existing and worst
>> effect is to fill logs will NPE logs... (yes, not something we want).
>>>> I believed that we (as a Geode community) agreed that once a release
>> has been cut, only critical issue fixes will be included. If we continue
>> just continually adding to the ALREADY CUT 1.10 release, where do we stop
>> and when do we release...
>>>> --Udo
>>>> On 8/15/19 10:19 AM, Nabarun Nag wrote:
>>>>> +1
>>>>> On Thu, Aug 15, 2019 at 10:15 AM Alexander Murmann <
>>>>> wrote:
>>>>>> +1
>>>>>> Agreed to fixing this. It's impossible for a user to discover they
>> hit an
>>>>>> edge case that we fail to support till they are in prod and restart.
>>>>>> On Thu, Aug 15, 2019 at 10:09 AM Juan José Ramos <>
>>>>>> wrote:
>>>>>>> Hello Udo,
>>>>>>> Even if it is an existing issue I'd still consider it critical
>> those
>>>>>>> cases on which there are unprocessed events on the persistent
>>>>>> after a
>>>>>>> restart and the region takes long to recover... you can actually
>>>>>>> millions of *NPEs* flooding the member's logs.
>>>>>>> My two cents anyway, it's up to the community to make the final
>> decision.
>>>>>>> Cheers.
>>>>>>> On Thu, Aug 15, 2019 at 5:58 PM Udo Kohlmeyer <>
>> wrote:
>>>>>>>> Juan,
>>>>>>>>    From your explanation, it seems this issue is existing
and not
>>>>>>>> critical. Could we possibly hold this for 1.11?
>>>>>>>> --Udo
>>>>>>>> On 8/15/19 5:29 AM, Ju@N wrote:
>>>>>>>>> Hello team,
>>>>>>>>> I'd like to propose including the *fix [1]* for *GEODE-7079
[2]* in
>>>>>>>> release
>>>>>>>>> 1.10.0.
>>>>>>>>> Long story short: a *NullPointerException* can be continuously
>> thrown
>>>>>>>>> and flood the member's logs if a serial event processor
>>>>>>>>> *async-event-queue* or *gateway-sender*) starts processing
>>>>>> from
>>>>>>> a
>>>>>>>>> recovered persistent queue before the actual region to
which it was
>>>>>>>>> attached is fully operational.
>>>>>>>>> Note: *no events are lost (even without the fix)* but,
if the
>> region
>>>>>>>> takes
>>>>>>>>> a while to recover, the logs  for the member can grow
>> quickly
>>>>>>> due
>>>>>>>> to
>>>>>>>>> the continuously thrown *NPEs.*
>>>>>>>>> Best regards.
>>>>>>>>> [1]:
>>>>>>>>> [2]:
>>>>>>> --
>>>>>>> Juan José Ramos Cassella
>>>>>>> Senior Software Engineer
>>>>>>> Email:

View raw message