zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andor Molnár <an...@apache.org>
Subject Re: JAVA 11 build is broken on 3.5
Date Wed, 09 Jan 2019 21:18:01 GMT
Thank you guys for your support.

I leave this open for the community to decide, but if none of the
binding voters (committers, PMCs) respond in 1 week, I'll take
responsibility and commit the patches. I know holidays are still going
on, but 3.5 release is important we should make progress.


Thanks,

Andor



On 1/9/19 21:54, Norbert Kalmar wrote:
> I agree with backporting Netty 4 and the related patches to 3.5.5.
> +1 (non binding) (and I might have already voted)
>
> If it took so long to release a stable 3.5, I don't think we should leave a
> known bug in the system, which has such an impact (not supporting current
> LTS java).
> The PRs shouldn't be hard to backport, no new development is required, and
> unfortunately maven migration isn't finished yet, so it's not like we need
> to postpone the release just because of Netty upgrade. That's my two cents
> anyway.
>
> Regards,
> Norbert
>
> On Wed, Jan 9, 2019 at 9:19 PM Enrico Olivelli <eolivelli@gmail.com> wrote:
>
>> Il giorno mer 9 gen 2019 alle ore 21:16 Andor Molnar
>> <andor@apache.org> ha scritto:
>>> So, you guys saying we should upgrade to Netty 4 and backport the
>> following patches too:
>>> ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
>>> ZOOKEEPER-3163 Use session map to improve the performance when closing
>> session in Netty
>>> ZOOKEEPER-3146 Limit the maximum client connections per IP in
>> NettyServerCnxnFactory
>>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep
>> the same behavior and make the code easier to maintain
>>> I think it’s still deliverable with the stable release. Regarding the
>> timing, let’s say we release 3.5.5 with Netty 3 and do the upgrade in
>> 3.5.6. Is it acceptable to do such a big with a minor version change?
>>
>> I think we can do the upgrade in 3.5.5.
>> We can't say Java 11 is supported if we don't have tests working. And
>> Java 11 is the current LTS release from Oracle.
>>
>> If we release now 3.5.5 then we need to work for 3.5.6 and then
>> release....it is a double work, we don't have so much resources :-(
>>
>> IMHO it is better to release 3.5.5 and start focusing on 3.6.0
>>
>> Enrico
>>
>>> Upgrading now isn't ideal either, but maybe somewhat less brutal.
>>>
>>> Regards,
>>> Andor
>>>
>>>
>>>
>>>
>>>> On 2019. Jan 8., at 2:10, Brian Nixon <brian.nixon.cs@gmail.com>
>> wrote:
>>>> Also worth consideration for the Netty fixes backport bundle:
>>>>
>>>> ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
>>>>
>>>> Our experience with Netty has been positive so far though I don't
>> think we
>>>> have enough info to declare it is stable. It makes a lot of sense to
>> me for
>>>> 3.5 to use Netty 4, the biggest question is the timing.
>>>>
>>>>
>>>> On Fri, Jan 4, 2019 at 10:44 AM Andor Molnár <andor@apache.org> wrote:
>>>>
>>>>> I got another one:
>>>>>
>>>>> ZOOKEEPER-3163 Use session map to improve the performance when closing
>>>>> session in Netty
>>>>>
>>>>>
>>>>> So, that doesn't seem too many. We can talk about backporting them,
>> but
>>>>> I don't think they're super critical for the first stable release.
>>>>>
>>>>> Regarding 3.4, I need to validate the embedded scenario, but at least
>>>>> the Java 11 build is green. :)
>>>>>
>>>>>
>>>>> Andor
>>>>>
>>>>>
>>>>>
>>>>> On 1/4/19 18:08, Enrico Olivelli wrote:
>>>>>> Il giorno ven 4 gen 2019 alle ore 17:04 Norbert Kalmar
>>>>>> <nkalmar@cloudera.com.invalid> ha scritto:
>>>>>>> +1 for Netty 4 in 3.5
>>>>>>>
>>>>>>> Pretty much all the pros and cons has been said before me.
>>>>>>> I would only add that this is not a new functionality that we
wan't
>> to
>>>>>>> backport. It's a criticall(ish?) bugfix, which requires quite
a bit
>> of
>>>>>>> change unfortunately.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Norbert
>>>>>>>
>>>>>>> On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <andor@apache.org>
>> wrote:
>>>>>>>> What are those patches exactly?
>>>>>>>>
>>>>>>>> Comparing the ported version of 3.5 with master I’ve only
found 2
>>>>> patches
>>>>>>>> which are missing:
>>>>>>>>
>>>>>>>> ZOOKEEPER-3146 Limit the maximum client connections per IP
in
>>>>>>>> NettyServerCnxnFactory
>>>>>>>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and
Netty to
>> keep
>>>>>>>> the same behavior and make the code easier to maintain
>>>>>>>>
>>>>>>>> None of them are critical I would say.
>>>>>>>> Is there anything else I’m missing?
>>>>>> I have compared the histories and I think Andor you are right.
>>>>>>
>>>>>> master:
>>>>>> https://github.com/apache/zookeeper/commits/master
>>>>>>
>>>>>> branch-3.5:
>>>>>> https://github.com/apache/zookeeper/commits/branch-3.5
>>>>>>
>>>>>> Sorry I was thinking to the amount of patches related to TLS stuff
,
>>>>>> but they are not related to Netty 4.
>>>>>>
>>>>>> What about branch 3.4 ?
>>>>>> We don't have reconfig but the case "Start embedded ZK, stop it and
>>>>>> try to restart on the same port" should apply as well.
>>>>>>
>>>>>> Enrico
>>>>>>
>>>>>>>> Andor
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 2019. Jan 4., at 16:27, Enrico Olivelli <eolivelli@gmail.com>
>>>>> wrote:
>>>>>>>>> Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar
>>>>>>>>> <andor@apache.org <mailto:andor@apache.org>>
ha scritto:
>>>>>>>>>> Hi team / Enrico,
>>>>>>>>>>
>>>>>>>>>> I’d like to get feedback from the community on
the following
>> patch
>>>>>>>> (moving the discussion from GitHub to here):
>>>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204
<
>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204>
>>>>>>>>>> https://github.com/apache/zookeeper/pull/753 <
>>>>>>>> https://github.com/apache/zookeeper/pull/753>
>>>>>>>>>> In a nutshell: looks like that Netty 3.10 is broken
under Java
>> 11: it
>>>>>>>> doesn’t properly close the underlying socket (probably
not closing
>> the
>>>>>>>> registered NIO selectors) and reconfig tests are unable to
re-bind
>> the
>>>>>>>> ports. This problem is similar that we already fixed in NIO
with
>> the
>>>>>>>> following patch:
>>>>>>>>
>> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
>>>>>>>> <
>>>>>>>>
>> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2
>>>>>>>>>> The problem doesn’t show up on trunk which has
been recently
>> upgraded
>>>>>>>> to Netty 4.
>>>>>>>>>> Repro:
>>>>>>>>>> - Start embedded ZK, stop it and try to restart on
the same
>> port, or
>>>>>>>>>> - Start normal ensemble and reconfig to use different
(client)
>> port.
>>>>>>>> Then reconfig back to the original port which should fail.
(that’s
>> the
>>>>>>>> scenario which is covered in ReconfigTest)
>>>>>>>>>> I created the above patch (#753) to backport Netty
4 upgrade to
>> 3.5
>>>>> and
>>>>>>>> it fixes the problem with Java 11 (it doesn’t cause regression
in
>> the
>>>>>>>> pre-commit build either), but Enrico is having concerns about
>> making
>>>>> such
>>>>>>>> big change before the release.
>>>>>>>>>> I tend to agree, but let’s see what are the options.
>>>>>>>>>>
>>>>>>>>>> Thoughts:
>>>>>>>>>> - Do we have to fix this? - Yes. Java 11 is LTS and
I the bug is
>>>>>>>> critical.
>>>>>>>>>> - Can we fix Netty 3? - Maybe. Let’s say we find
the bug in
>> Netty 3,
>>>>>>>> what can we do?
>>>>>>>>>>      a) We cannot workaround from ZooKeeper itself
and have to
>>>>> submit
>>>>>>>> a pull request for Netty. I think it’s quite unlikely that
they
>> will
>>>>> accept
>>>>>>>> the change given it’s not a security bug, but even if they
did,
>> only
>>>>> the
>>>>>>>> upgraded version of Netty 3 would work properly with ZooKeeper.
>> Err.
>>>>>>>>>>      b) We can workaround it from ZooKeeper: that
could be option
>>>>> #1,
>>>>>>>> but I have a strong feeling about it’s not going to be
the case.
>>>>>>>>>> - Shall we upgrade to Netty 4? - this is option #2
>>>>>>>>>>
>>>>>>>>>> Please share your thoughts, maybe you know about
an option #3.
>>>>>>>>> Thank you Andor
>>>>>>>>>
>>>>>>>>> I have thought more about this problem, and I have checked
that
>> Netty
>>>>>>>>> 3 is really dead/unmantained (last release in 2016).
>>>>>>>>> If I understand correctly there is no easy workaround
(nothing
>> without
>>>>>>>>> hacking Netty 3 internals)
>>>>>>>>>
>>>>>>>>> As soon as we will declare 3.5.5 "stable" the world will
hopefully
>>>>>>>>> abandon 3.4 and switch to 3.5 + Netty (because of SSL
support).
>>>>>>>>> The network stack is very important so it is better to
have Netty
>> 4 as
>>>>>>>>> foundation, I am thinking about security issues, we won't
make an
>>>>>>>>> "hotfix" release with the switch to Netty 4 because there
is a
>> bad bug
>>>>>>>>> in Netty 3.
>>>>>>>>> So better to switch now.
>>>>>>>>>
>>>>>>>>> But Facebooks friends, expecially @ivmaykov did a lot
of bugfixes
>>>>>>>>> around Netty on master branch, we must be sure that what
we are
>>>>>>>>> delivering in 3.5.5 is stable.
>>>>>>>>>
>>>>>>>>> We will also have to state clearly in the "release notes"
that
>> Netty
>>>>>>>>> version is changed, as this may have a non trivial impact
to
>> memory
>>>>>>>>> usage (i.e. Netty 4 uses more Direct memory by default)
>>>>>>>>>
>>>>>>>>> So to recap my final opinion: +1 to switch to Netty 4
if we take
>> care
>>>>>>>>> of port all of the fixes around Netty 4 from master branch
and we
>>>>>>>>> state the switch clearly in the release notes
>>>>>>>>>
>>>>>>>>> Enrico
>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Andor

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message