zookeeper-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-1669) Operations to server will be timed-out while thousands of sessions expired same time
Date Fri, 21 Jul 2017 05:45:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095812#comment-16095812 ]

ASF GitHub Bot commented on ZOOKEEPER-1669:
-------------------------------------------

Github user eribeiro commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/312#discussion_r128688135
  
    --- Diff: src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java ---
    @@ -275,20 +307,9 @@ public synchronized void closeSession(long sessionId) {
     
         @SuppressWarnings("unchecked")
         private void closeSessionWithoutWakeup(long sessionId) {
    -        HashSet<NIOServerCnxn> cnxns;
    -        synchronized (this.cnxns) {
    -            cnxns = (HashSet<NIOServerCnxn>)this.cnxns.clone();
    -        }
    -
    -        for (NIOServerCnxn cnxn : cnxns) {
    -            if (cnxn.getSessionId() == sessionId) {
    -                try {
    -                    cnxn.close();
    -                } catch (Exception e) {
    -                    LOG.warn("exception during session close", e);
    -                }
    -                break;
    -            }
    +        NIOServerCnxn cnxn = sessionMap.remove(sessionId);
    +        if (cnxn != null) {
    +            cnxn.close();
    --- End diff --
    
    Why did you remove the `try-catch` block around `cnxn.close()`? We can still have exceptions thrown during `cnxn.close()`, right?
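
    To make the question concrete, here is a minimal sketch of the map-based lookup from the diff with the exception handling kept in place. It is not the code from the pull request: `Connection`, `FakeConnection`, and `SessionRegistry` are hypothetical stand-ins for `NIOServerCnxn` and the factory, logging goes to stderr instead of `LOG`, and the boolean return value is added only so the behavior can be checked.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Sketch only: these types are hypothetical stand-ins, not ZooKeeper classes. */
class SessionCloseSketch {

    /** Models only the two calls the diff uses on NIOServerCnxn. */
    interface Connection {
        long getSessionId();
        void close();
    }

    /** Test double whose close() can be made to fail with a runtime exception. */
    static class FakeConnection implements Connection {
        private final long sessionId;
        private final boolean failOnClose;

        FakeConnection(long sessionId, boolean failOnClose) {
            this.sessionId = sessionId;
            this.failOnClose = failOnClose;
        }

        @Override public long getSessionId() { return sessionId; }

        @Override public void close() {
            if (failOnClose) {
                throw new IllegalStateException("simulated failure in close()");
            }
        }
    }

    static class SessionRegistry {
        // Assumption: a concurrent map keyed by session id, like sessionMap in the diff.
        private final ConcurrentMap<Long, Connection> sessionMap = new ConcurrentHashMap<>();

        void register(Connection cnxn) {
            sessionMap.put(cnxn.getSessionId(), cnxn);
        }

        int size() {
            return sessionMap.size();
        }

        /** Returns true only if a connection was found and closed without an exception. */
        boolean closeSessionWithoutWakeup(long sessionId) {
            Connection cnxn = sessionMap.remove(sessionId); // direct lookup, no clone and scan
            if (cnxn == null) {
                return false;
            }
            try {
                cnxn.close();
                return true;
            } catch (Exception e) {
                // Same intent as the removed LOG.warn: one bad close must not escape the caller.
                System.err.println("exception during session close: " + e);
                return false;
            }
        }
    }

    public static void main(String[] args) {
        SessionRegistry registry = new SessionRegistry();
        registry.register(new FakeConnection(1L, false));
        registry.register(new FakeConnection(2L, true));
        System.out.println(registry.closeSessionWithoutWakeup(1L));
        System.out.println(registry.closeSessionWithoutWakeup(2L));
        System.out.println(registry.size());
    }
}
```

    In this sketch the entry is removed from the map before `close()` runs, so a failing close still leaves the map clean, and the catch keeps the exception from reaching whichever thread triggered the close.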


> Operations to server will be timed-out while thousands of sessions expired same time
> ------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1669
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1669
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>    Affects Versions: 3.3.5
>            Reporter: tokoot
>            Assignee: Cheney Sun
>              Labels: performance
>
> If there are thousands of clients and most of them disconnect from the server at the same time (clients restarted, or servers partitioned from clients), the server becomes busy closing those "connections" and becomes unavailable. The problem is in the following code:
>   private void closeSessionWithoutWakeup(long sessionId) {
>       HashSet<NIOServerCnxn> cnxns;
>       synchronized (this.cnxns) {
>           cnxns = (HashSet<NIOServerCnxn>)this.cnxns.clone();  // other threads block here
>       }
>       ...
>   }
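
The cost described in the issue can be illustrated with a simplified, single-threaded model. The sketch below is an assumption-laden illustration, not ZooKeeper code: plain `Long` session ids stand in for `NIOServerCnxn` objects, `CloseCostSketch` and its methods are hypothetical names, and it only counts how many entries get copied while the lock is held; it does not measure real lock contention.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

/** Sketch only: counts work done per strategy; Long ids stand in for connections. */
class CloseCostSketch {

    /** Entries copied under the lock when every session is closed by clone-then-scan. */
    static long copiedByCloneAndScan(int sessions) {
        HashSet<Long> cnxns = new HashSet<>();
        for (long id = 0; id < sessions; id++) {
            cnxns.add(id);
        }
        long copied = 0;
        for (long id = 0; id < sessions; id++) {
            HashSet<Long> snapshot;
            synchronized (cnxns) {
                // Like clone(): copies one entry per live connection while holding the lock.
                snapshot = new HashSet<>(cnxns);
            }
            copied += snapshot.size();
            for (Long candidate : snapshot) {
                if (candidate.longValue() == id) {
                    synchronized (cnxns) {
                        cnxns.remove(candidate); // models close() dropping the connection
                    }
                    break;
                }
            }
        }
        return copied;
    }

    /** Map removals needed when every session is closed by direct lookup. */
    static long removalsByMap(int sessions) {
        Map<Long, Long> sessionMap = new HashMap<>();
        for (long id = 0; id < sessions; id++) {
            sessionMap.put(id, id);
        }
        long removals = 0;
        for (long id = 0; id < sessions; id++) {
            if (sessionMap.remove(id) != null) {
                removals++;
            }
        }
        return removals;
    }

    public static void main(String[] args) {
        int sessions = 1000;
        System.out.println("entries copied by clone-and-scan: " + copiedByCloneAndScan(sessions));
        System.out.println("removals by map lookup: " + removalsByMap(sessions));
    }
}
```

With 1000 sessions this model copies 1000 + 999 + ... + 1 = 500500 entries under the lock for the clone-and-scan approach, against 1000 map removals for the direct lookup. The quadratic growth is what would keep other threads waiting on the lock when many sessions expire together.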



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
