zookeeper-issues mailing list archives

From "Ravi Kishore Valeti (Jira)" <j...@apache.org>
Subject [jira] [Comment Edited] (ZOOKEEPER-4235) Java Client SendThread does not clean up created objects during constructor of SaslClient and Login
Date Mon, 03 May 2021 09:37:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338280#comment-17338280 ]

Ravi Kishore Valeti edited comment on ZOOKEEPER-4235 at 5/3/21, 9:36 AM:

[~dbwong], I am picking this up. Can someone please assign this to me? I can't change the assignee.

was (Author: rvaleti):
[~dbwong], I am picking this up. Can some assign this to me?. I can't change the assignee.

> Java Client SendThread does not clean up created objects during constructor of SaslClient
and Login
> ---------------------------------------------------------------------------------------------------
>                 Key: ZOOKEEPER-4235
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4235
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>            Reporter: Daniel Wong
>            Priority: Major
> Hi, I am an Apache Phoenix committer, and at my employer I help manage many ZooKeeper clusters,
primarily for HBase use cases.  We recently had a production incident in which some of our ACLs
were not set up, preventing clients from connecting to the ZK nodes, and the failure path exposed
two issues to fix: this Jira and https://issues.apache.org/jira/browse/ZOOKEEPER-4236 .
This Jira is the more important of the two. It covers the failure we observed: an FD/thread leak
from the ZK Java client send thread.  We had hundreds of threads per JVM with the following
stack trace.
> {code:java}
> java.lang.Thread.State: RUNNABLE
>     at java.net.PlainSocketImpl.socketConnect(java.base@ Method)
>     at java.net.AbstractPlainSocketImpl.doConnect(java.base@
>     - locked <0x00000015004fde20> (a java.net.SocksSocketImpl)
>     at java.net.AbstractPlainSocketImpl.connectToAddress(java.base@
>     at java.net.AbstractPlainSocketImpl.connect(java.base@
>     at java.net.SocksSocketImpl.connect(java.base@
>     at java.net.Socket.connect(java.base@
>     at sun.security.krb5.internal.TCPClient.<init>(java.security.jgss@
>     at sun.security.krb5.internal.NetClient.getInstance(java.security.jgss@
>     at sun.security.krb5.KdcComm$KdcCommunication.run(java.security.jgss@
>     at sun.security.krb5.KdcComm$KdcCommunication.run(java.security.jgss@
>     at java.security.AccessController.doPrivileged(java.base@ Method)
>     at sun.security.krb5.KdcComm.send(java.security.jgss@
>     at sun.security.krb5.KdcComm.sendIfPossible(java.security.jgss@
>     at sun.security.krb5.KdcComm.send(java.security.jgss@
>     at sun.security.krb5.KdcComm.send(java.security.jgss@
>     at sun.security.krb5.KrbAsReqBuilder.send(java.security.jgss@
>     at sun.security.krb5.KrbAsReqBuilder.action(java.security.jgss@
>     at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(jdk.security.auth@
>     at com.sun.security.auth.module.Krb5LoginModule.login(jdk.security.auth@
>     at javax.security.auth.login.LoginContext.invoke(java.base@
>     at javax.security.auth.login.LoginContext$4.run(java.base@
>     at javax.security.auth.login.LoginContext$4.run(java.base@
>     at java.security.AccessController.doPrivileged(java.base@ Method)
>     at javax.security.auth.login.LoginContext.invokePriv(java.base@
>     at javax.security.auth.login.LoginContext.login(java.base@
>     at org.apache.zookeeper.Login.login(Login.java:304)
>     - locked <0x000000151c477148> (a org.apache.zookeeper.Login)
>     at org.apache.zookeeper.Login.<init>(Login.java:106)
>     at
>     - locked <0x000000151c476f68> (a org.apache.zookeeper.client.ZooKeeperSaslClient)
>     at
>     at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:972)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1031)
> {code}
> Note that today both ZooKeeperSaslClient and Login allocate resources in their constructors.
Because the parent's reference to the object is still null until the constructor returns, a
close/shutdown/disconnect of the parent cannot clean up or interrupt them.  This leaves the
threads/sockets at the mercy of the configured KDC retry/timeout settings.
> This Jira is intended to split construction and initialization into separate methods and to
properly clean up the resulting objects.
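
The split described above can be illustrated with a minimal sketch. It is not the ZooKeeper patch: `TwoPhaseInitSketch`, `SlowResource`, `Owner`, `connect` and `initialize` are hypothetical names, and a timed latch wait stands in for the blocking Kerberos login. The point it tries to show is that once the constructor is cheap and the reference is published before the blocking step, the owner's `close()` has something to act on.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative only: every type here is hypothetical, none is a ZooKeeper class. */
final class TwoPhaseInitSketch {

    /** Stand-in for a resource like Login: the constructor allocates nothing and cannot block. */
    static final class SlowResource implements AutoCloseable {
        private final CountDownLatch closeRequested = new CountDownLatch(1);

        /**
         * The blocking step. The timed wait stands in for a slow network login.
         * Returns true if the wait ran to completion, false if close() cut it short.
         */
        boolean initialize(long simulatedLoginMillis) throws InterruptedException {
            boolean cutShort = closeRequested.await(simulatedLoginMillis, TimeUnit.MILLISECONDS);
            return !cutShort;
        }

        @Override
        public void close() {
            closeRequested.countDown(); // idempotent; wakes a blocked initialize()
        }
    }

    /** Stand-in for the owning connection thread. */
    static final class Owner implements AutoCloseable {
        private final AtomicReference<SlowResource> resource = new AtomicReference<>();
        private volatile boolean closed;

        boolean connect(long simulatedLoginMillis) throws InterruptedException {
            SlowResource r = new SlowResource(); // cheap construction
            resource.set(r);                     // published before any blocking work
            if (closed) {
                r.close();                       // close() ran before the reference was visible
            }
            return r.initialize(simulatedLoginMillis);
        }

        @Override
        public void close() {
            closed = true;
            SlowResource r = resource.get();
            if (r != null) {
                r.close();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Owner owner = new Owner();
        Thread sender = new Thread(() -> {
            try {
                System.out.println("initialized: " + owner.connect(60_000));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sender.start();
        owner.close();       // returns immediately and unblocks the sender
        sender.join(5_000);
        System.out.println("sender alive: " + sender.isAlive());
    }
}
```

In this sketch the sender thread exits as soon as `close()` is called rather than waiting out the simulated 60-second login. A real fix would additionally have to release whatever sockets or login state the interrupted step had already acquired, which the sketch does not model.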

This message was sent by Atlassian Jira
