zookeeper-issues mailing list archives

From "Colvin Cowie (Jira)" <j...@apache.org>
Subject [jira] [Updated] (ZOOKEEPER-4296) NullPointerException when ClientCnxnSocketNetty is closed without being opened
Date Mon, 17 May 2021 23:14:00 GMT

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colvin Cowie updated ZOOKEEPER-4296:
------------------------------------
    Description: 
I believe this bug was originally reported as ZOOKEEPER-2966 but that was closed as not reproducible
in February 2019. I left a comment with these details on that issue in December. I can create
a PR with a fix at some point this week.

 

In ZooKeeper 3.6.2, in the context of the SolrJ client, we hit the NPE reported on ZOOKEEPER-2966
when a DNS error causes an exception after the SolrZkClient tries to connect to ZooKeeper,
and the SolrZkClient then immediately calls close on the {{ClientCnxn}} [https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/cloud/SolrZkClient.java#L158-L204].
{noformat}
java.lang.NullPointerException: null
        at org.apache.zookeeper.ClientCnxnSocketNetty.onClosing(ClientCnxnSocketNetty.java:247) ~[zookeeper-3.6.2.jar:3.6.2]
        at org.apache.zookeeper.ClientCnxn$SendThread.close(ClientCnxn.java:1445) ~[zookeeper-3.6.2.jar:3.6.2]
        at org.apache.zookeeper.ClientCnxn.disconnect(ClientCnxn.java:1488) ~[zookeeper-3.6.2.jar:3.6.2]
        at org.apache.zookeeper.ClientCnxn.close(ClientCnxn.java:1517) ~[zookeeper-3.6.2.jar:3.6.2]
        at org.apache.zookeeper.ZooKeeper.close(ZooKeeper.java:1614) ~[zookeeper-3.6.2.jar:3.6.2]
        at org.apache.solr.common.cloud.SolrZooKeeper.close(SolrZooKeeper.java:97) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
        at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:198) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
        at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:127) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
        at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:122) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
        at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:109) ~[solr-solrj-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:39:18]
{noformat}
This happens if the {{ClientCnxnSocketNetty}}'s {{onClosing()}} is called before {{connect(...)}}
(or if connect isn't called at all) because the {{firstConnect}} {{CountDownLatch}} is only
initialized in {{connect(...)}}.
 [https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxnSocketNetty.java#L129]
 [https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxnSocketNetty.java#L247]
 A null check in {{onClosing()}} will fix it, but I don't know if there's any greater change
required, e.g. some synchronization around connect and onClosing.
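
For illustration only, here is a minimal sketch of the pattern described above. {{LatchSocketSketch}} and its methods are hypothetical stand-ins, not the actual {{ClientCnxnSocketNetty}} code, and the sketch is single-threaded, so it says nothing about whether extra synchronization around connect and onClosing is needed.

```java
import java.util.concurrent.CountDownLatch;

/**
 * Hypothetical stand-in for a socket wrapper whose latch is only
 * created in connect(). This is NOT the ZooKeeper source.
 */
class LatchSocketSketch {
    // Stays null until connect() runs, mirroring the reported situation.
    private CountDownLatch firstConnect;

    void connect() {
        firstConnect = new CountDownLatch(1);
    }

    // Unguarded variant: throws NullPointerException if connect() never ran.
    void onClosingUnguarded() {
        firstConnect.countDown();
    }

    // Guarded variant: the null check makes close-before-connect a no-op.
    void onClosingGuarded() {
        CountDownLatch latch = firstConnect;
        if (latch != null) {
            latch.countDown();
        }
    }

    // Remaining count on the latch, or -1 if connect() has not been called.
    long pending() {
        return firstConnect == null ? -1 : firstConnect.getCount();
    }

    public static void main(String[] args) {
        LatchSocketSketch neverConnected = new LatchSocketSketch();
        try {
            neverConnected.onClosingUnguarded();
        } catch (NullPointerException e) {
            System.out.println("unguarded close before connect threw an NPE");
        }
        neverConnected.onClosingGuarded(); // completes without throwing

        LatchSocketSketch connected = new LatchSocketSketch();
        connected.connect();
        connected.onClosingGuarded(); // releases the latch as usual
        System.out.println("pending after guarded close: " + connected.pending());
    }
}
```

The guarded variant reads the field into a local before checking it, so the check and the {{countDown()}} call act on the same reference.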

The code in [3.5.3|https://github.com/apache/zookeeper/blame/1507f67a06175155003722297daeb60bc912af1d/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxnSocketNetty.java#L206]
looks very similar, so the bug appears to have been present since the initial commit of {{ClientCnxnSocketNetty}}.



> NullPointerException when ClientCnxnSocketNetty is closed without being opened
> ------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4296
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4296
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.5.9, 3.5.3, 3.6.3, 3.6.2
>            Reporter: Colvin Cowie
>            Priority: Minor



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
