zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andor Molnar <an...@apache.org>
Subject Re: JAVA 11 build is broken on 3.5
Date Wed, 09 Jan 2019 20:16:04 GMT
So, you guys saying we should upgrade to Netty 4 and backport the following patches too:

ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
ZOOKEEPER-3163 Use session map to improve the performance when closing session in Netty
ZOOKEEPER-3146 Limit the maximum client connections per IP in NettyServerCnxnFactory
ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep the same behavior
and make the code easier to maintain

I think it’s still deliverable with the stable release. Regarding the timing, let’s say
we release 3.5.5 with Netty 3 and do the upgrade in 3.5.6. Is it acceptable to do such a big
with a minor version change?

Upgrading now isn't ideal either, but maybe somewhat less brutal.

Regards,
Andor




> On 2019. Jan 8., at 2:10, Brian Nixon <brian.nixon.cs@gmail.com> wrote:
> 
> Also worth consideration for the Netty fixes backport bundle:
> 
> ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
> 
> Our experience with Netty has been positive so far though I don't think we
> have enough info to declare it is stable. It makes a lot of sense to me for
> 3.5 to use Netty 4, the biggest question is the timing.
> 
> 
> On Fri, Jan 4, 2019 at 10:44 AM Andor Molnár <andor@apache.org> wrote:
> 
>> I got another one:
>> 
>> ZOOKEEPER-3163 Use session map to improve the performance when closing
>> session in Netty
>> 
>> 
>> So, that doesn't seem too many. We can talk about backporting them, but
>> I don't think they're super critical for the first stable release.
>> 
>> Regarding 3.4, I need to validate the embedded scenario, but at least
>> the Java 11 build is green. :)
>> 
>> 
>> Andor
>> 
>> 
>> 
>> On 1/4/19 18:08, Enrico Olivelli wrote:
>>> Il giorno ven 4 gen 2019 alle ore 17:04 Norbert Kalmar
>>> <nkalmar@cloudera.com.invalid> ha scritto:
>>>> +1 for Netty 4 in 3.5
>>>> 
>>>> Pretty much all the pros and cons has been said before me.
>>>> I would only add that this is not a new functionality that we wan't to
>>>> backport. It's a criticall(ish?) bugfix, which requires quite a bit of
>>>> change unfortunately.
>>>> 
>>>> Regards,
>>>> Norbert
>>>> 
>>>> On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <andor@apache.org> wrote:
>>>> 
>>>>> What are those patches exactly?
>>>>> 
>>>>> Comparing the ported version of 3.5 with master I’ve only found 2
>> patches
>>>>> which are missing:
>>>>> 
>>>>> ZOOKEEPER-3146 Limit the maximum client connections per IP in
>>>>> NettyServerCnxnFactory
>>>>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep
>>>>> the same behavior and make the code easier to maintain
>>>>> 
>>>>> None of them are critical I would say.
>>>>> Is there anything else I’m missing?
>>> I have compared the histories and I think Andor you are right.
>>> 
>>> master:
>>> https://github.com/apache/zookeeper/commits/master
>>> 
>>> branch-3.5:
>>> https://github.com/apache/zookeeper/commits/branch-3.5
>>> 
>>> Sorry I was thinking to the amount of patches related to TLS stuff ,
>>> but they are not related to Netty 4.
>>> 
>>> What about branch 3.4 ?
>>> We don't have reconfig but the case "Start embedded ZK, stop it and
>>> try to restart on the same port" should apply as well.
>>> 
>>> Enrico
>>> 
>>>>> Andor
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 2019. Jan 4., at 16:27, Enrico Olivelli <eolivelli@gmail.com>
>> wrote:
>>>>>> 
>>>>>> Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar
>>>>>> <andor@apache.org <mailto:andor@apache.org>> ha scritto:
>>>>>>> Hi team / Enrico,
>>>>>>> 
>>>>>>> I’d like to get feedback from the community on the following
patch
>>>>> (moving the discussion from GitHub to here):
>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 <
>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
>>>>>>> https://github.com/apache/zookeeper/pull/753 <
>>>>> https://github.com/apache/zookeeper/pull/753>
>>>>>>> In a nutshell: looks like that Netty 3.10 is broken under Java
11: it
>>>>> doesn’t properly close the underlying socket (probably not closing
the
>>>>> registered NIO selectors) and reconfig tests are unable to re-bind the
>>>>> ports. This problem is similar that we already fixed in NIO with the
>>>>> following patch:
>>>>> 
>> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
>>>>> <
>>>>> 
>> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
>>>>>>> The problem doesn’t show up on trunk which has been recently
upgraded
>>>>> to Netty 4.
>>>>>>> Repro:
>>>>>>> - Start embedded ZK, stop it and try to restart on the same port,
or
>>>>>>> - Start normal ensemble and reconfig to use different (client)
port.
>>>>> Then reconfig back to the original port which should fail. (that’s
the
>>>>> scenario which is covered in ReconfigTest)
>>>>>>> I created the above patch (#753) to backport Netty 4 upgrade
to 3.5
>> and
>>>>> it fixes the problem with Java 11 (it doesn’t cause regression in the
>>>>> pre-commit build either), but Enrico is having concerns about making
>> such
>>>>> big change before the release.
>>>>>>> I tend to agree, but let’s see what are the options.
>>>>>>> 
>>>>>>> Thoughts:
>>>>>>> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug
is
>>>>> critical.
>>>>>>> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in
Netty 3,
>>>>> what can we do?
>>>>>>>      a) We cannot workaround from ZooKeeper itself and have to
>> submit
>>>>> a pull request for Netty. I think it’s quite unlikely that they will
>> accept
>>>>> the change given it’s not a security bug, but even if they did, only
>> the
>>>>> upgraded version of Netty 3 would work properly with ZooKeeper. Err.
>>>>>>>      b) We can workaround it from ZooKeeper: that could be option
>> #1,
>>>>> but I have a strong feeling about it’s not going to be the case.
>>>>>>> - Shall we upgrade to Netty 4? - this is option #2
>>>>>>> 
>>>>>>> Please share your thoughts, maybe you know about an option #3.
>>>>>> Thank you Andor
>>>>>> 
>>>>>> I have thought more about this problem, and I have checked that Netty
>>>>>> 3 is really dead/unmantained (last release in 2016).
>>>>>> If I understand correctly there is no easy workaround (nothing without
>>>>>> hacking Netty 3 internals)
>>>>>> 
>>>>>> As soon as we will declare 3.5.5 "stable" the world will hopefully
>>>>>> abandon 3.4 and switch to 3.5 + Netty (because of SSL support).
>>>>>> The network stack is very important so it is better to have Netty
4 as
>>>>>> foundation, I am thinking about security issues, we won't make an
>>>>>> "hotfix" release with the switch to Netty 4 because there is a bad
bug
>>>>>> in Netty 3.
>>>>>> So better to switch now.
>>>>>> 
>>>>>> But Facebooks friends, expecially @ivmaykov did a lot of bugfixes
>>>>>> around Netty on master branch, we must be sure that what we are
>>>>>> delivering in 3.5.5 is stable.
>>>>>> 
>>>>>> We will also have to state clearly in the "release notes" that Netty
>>>>>> version is changed, as this may have a non trivial impact to memory
>>>>>> usage (i.e. Netty 4 uses more Direct memory by default)
>>>>>> 
>>>>>> So to recap my final opinion: +1 to switch to Netty 4 if we take
care
>>>>>> of port all of the fixes around Netty 4 from master branch and we
>>>>>> state the switch clearly in the release notes
>>>>>> 
>>>>>> Enrico
>>>>>> 
>>>>>>> Regards,
>>>>>>> Andor
>>>>> 
>> 


Mime
View raw message