zookeeper-dev mailing list archives

From "Abhishek Singh Chouhan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-3059) EventThread leak in case of Sasl AuthFailed
Date Tue, 12 Jun 2018 06:32:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509245#comment-16509245 ]

Abhishek Singh Chouhan commented on ZOOKEEPER-3059:

[~andorm] Thanks for pointing out the jira. I was earlier thinking about taking a similar
approach and making the close call less restrictive so that it also works in the auth failed
case. However, going through the documentation a bit, I see that auth_failed is considered
similar to session expired (both are fatal events), and for session expiry we do kill the
event thread automatically. Hence I went with a similar approach and kill the event thread in
the case of auth failed too, rather than leaving it to the user (by changing the close method
and expecting the event thread to be shut down when that is called after auth failed). What do you think?

> EventThread leak in case of Sasl AuthFailed
> -------------------------------------------
>                 Key: ZOOKEEPER-3059
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3059
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.12
>            Reporter: Abhishek Singh Chouhan
>            Assignee: Abhishek Singh Chouhan
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: stack_dump
>          Time Spent: 10m
>  Remaining Estimate: 0h
> In case of an AuthFailed SASL event we shut down the send thread, but we never close
the event thread. Even if the client tries to close the connection, the call is a no-op, since
we check cnxn.getState().isAlive(), which returns false for the auth failed state, and
we return without cleaning up. For applications that retry on auth failed by closing
the existing connection and then trying to reconnect (e.g. HBase replication), this eventually
ends up exhausting system resources.
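
The leak described above can be illustrated with a small toy model. This is only a sketch, not ZooKeeper's actual client code: the names ToyClient, State, onAuthFailed, killEventThreadOnAuthFailed and leaksAfterAuthFailed are hypothetical and exist only for this illustration. It assumes an event thread that drains a queue until it receives a shutdown marker, and a close() that returns early when the state is not alive. The flag switches between the reported behaviour and the approach proposed in the comment (stopping the event thread as soon as auth fails).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model only: every name here is hypothetical and not ZooKeeper's real API.
class EventThreadLeakSketch {

    enum State {
        CONNECTED, CLOSED, AUTH_FAILED;

        // Mirrors the check described in the issue: auth failed counts as "not alive".
        boolean isAlive() {
            return this != CLOSED && this != AUTH_FAILED;
        }
    }

    static final class ToyClient {
        private static final Object SHUTDOWN_MARKER = new Object();

        private final BlockingQueue<Object> events = new LinkedBlockingQueue<>();
        private final boolean killEventThreadOnAuthFailed;
        private final Thread eventThread;
        private volatile State state = State.CONNECTED;

        ToyClient(boolean killEventThreadOnAuthFailed) {
            this.killEventThreadOnAuthFailed = killEventThreadOnAuthFailed;
            this.eventThread = new Thread(() -> {
                try {
                    // Block on the queue until the shutdown marker arrives.
                    while (events.take() != SHUTDOWN_MARKER) {
                        // A real client would dispatch watcher callbacks here.
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "toy-event-thread");
            this.eventThread.setDaemon(true); // do not keep the JVM alive in this demo
            this.eventThread.start();
        }

        // Stand-in for the point where the send thread stops after a SASL auth failure.
        void onAuthFailed() {
            state = State.AUTH_FAILED;
            if (killEventThreadOnAuthFailed) {
                events.add(SHUTDOWN_MARKER); // proposed approach: stop the event thread here
            }
        }

        void close() {
            if (!state.isAlive()) {
                return; // no-op after auth failed: the event thread is never told to stop
            }
            state = State.CLOSED;
            events.add(SHUTDOWN_MARKER);
        }

        boolean isEventThreadAlive(long waitMillis) {
            try {
                eventThread.join(waitMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return eventThread.isAlive();
        }
    }

    // Returns true if the event thread is still running after auth failed + close().
    static boolean leaksAfterAuthFailed(boolean killEventThreadOnAuthFailed) {
        ToyClient client = new ToyClient(killEventThreadOnAuthFailed);
        client.onAuthFailed();
        client.close();
        return client.isEventThreadAlive(500);
    }

    public static void main(String[] args) {
        System.out.println("leak without fix: " + leaksAfterAuthFailed(false));
        System.out.println("leak with fix:    " + leaksAfterAuthFailed(true));
    }
}
```

In this model, an application that reacts to auth failed by calling close() and creating a new client leaves one blocked event thread behind per retry unless the flag is set, which corresponds to the resource exhaustion reported here.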

This message was sent by Atlassian JIRA
