zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Han (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-2867) an expired ZK session can be re-established
Date Tue, 08 Aug 2017 04:49:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117850#comment-16117850
] 

Michael Han commented on ZOOKEEPER-2867:
----------------------------------------

In zk.1 log, looks like server 1 (which was leader a.t.m.) was cut off right after it proposing
expiring session 25cd1e82c110001:

{noformat}
June 27th 2017, 22:04:39.001   DEBUG org.apache.zookeeper.server.quorum.Leader Proposing::
sessionid:0x25cd1e82c110001 type:closeSession cxid:0x0 zxid:0x200000fc4 txntype:-11 reqpath:n/a

June 27th 2017, 22:04:39.076   ERROR org.apache.zookeeper.server.quorum.LearnerHandler Unexpected
exception causing shutdown while sock still open
{noformat}

And then server 1 went into leader election and eventually became a follower. So the session
expiring proposal never went through quorum thus the session 25cd1e82c110001 was not expired.
A more convincing proof regarding the results of session expiration would be to search all
the log for something like "Committing request:: sessionid:0x25cd1e82c110001" - if we have
a hit then that means the session expiring request reached a consensus and got committed.


Regarding the follow up question, if the session was not expired, then it is possible at least
in one case that we will see the "logging in ZK server about deleting an ephemeral in that
session.". That case is when a server receives session expiration request, it persists the
session close request locally to tnx log file, and then got cut off from rest of quorum. When
this server restarts, it will load its data tree from persistent log files, then replay session
close request which leads to delete ephemerals. Later when this server joins the quorum its
state will be synced with leader, which does not have the close session transaction (otherwise
the session would be expired already at quorum level), so leader will tell this server to
truncate its log by dropping the session close request, and the server will do so by truncate
and reloading logs, which eventually bring its data tree on par with rest of quorum. 

> an expired ZK session can be re-established
> -------------------------------------------
>
>                 Key: ZOOKEEPER-2867
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2867
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.10
>            Reporter: Jun Rao
>         Attachments: zk.0.formatted, zk.1.formatted
>
>
> Not sure if this is a real bug, but I found an instance when a ZK client seems to be
able to renew a session already expired by the ZK server.
> From ZK server log, session 25cd1e82c110001 was expired at 22:04:39.
> {code:java}
> June 27th 2017, 22:04:39.000	INFO	org.apache.zookeeper.server.ZooKeeperServer	Expiring
session 0x25cd1e82c110001, timeout of 12000ms exceeded
> June 27th 2017, 22:04:39.001	DEBUG	org.apache.zookeeper.server.quorum.Leader	Proposing::
sessionid:0x25cd1e82c110001 type:closeSession cxid:0x0 zxid:0x200000fc4 txntype:-11 reqpath:n/a
> June 27th 2017, 22:04:39.001	INFO	org.apache.zookeeper.server.PrepRequestProcessor	Processed
session termination for sessionid: 0x25cd1e82c110001
> June 27th 2017, 22:04:39.001	DEBUG	org.apache.zookeeper.server.quorum.CommitProcessor
Processing request:: sessionid:0x25cd1e82c110001 type:closeSession cxid:0x0 zxid:0x200000fc4
txntype:-11 reqpath:n/a
> June 27th 2017, 22:05:20.324	INFO	org.apache.zookeeper.server.quorum.Learner	Revalidating
client: 0x25cd1e82c110001
> June 27th 2017, 22:05:20.324	INFO	org.apache.zookeeper.server.ZooKeeperServer	Client
attempting to renew session 0x25cd1e82c110001 at /100.96.5.6:47618
> June 27th 2017, 22:05:20.325	INFO	org.apache.zookeeper.server.ZooKeeperServer	Established
session 0x25cd1e82c110001 with negotiated timeout 12000 for client /100.96.5.6:47618
> {code}
> From ZK client's log, it was able to renew the expired session on 22:05:20.
> {code:java}
> June 27th 2017, 22:05:18.590	INFO	org.apache.zookeeper.ClientCnxn	Client session timed
out, have not heard from server in 4485ms for sessionid 0x25cd1e82c110001, closing socket
connection and attempting reconnect	0
> June 27th 2017, 22:05:18.590	WARN	org.apache.zookeeper.ClientCnxn	Client session timed
out, have not heard from server in 4485ms for sessionid 0x25cd1e82c110001	0
> June 27th 2017, 22:05:19.325	WARN	org.apache.zookeeper.ClientCnxn	SASL configuration
failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client'
was found in specified JAAS configuration file: '/opt/confluent/etc/kafka/server_jaas.conf'.
Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server
allows it.	0
> June 27th 2017, 22:05:19.326	INFO	org.apache.zookeeper.ClientCnxn	Opening socket connection
to server 100.65.188.168/100.65.188.168:2181	0
> June 27th 2017, 22:05:20.324	INFO	org.apache.zookeeper.ClientCnxn	Socket connection established
to 100.65.188.168/100.65.188.168:2181, initiating session	0
> June 27th 2017, 22:05:20.327	INFO	org.apache.zookeeper.ClientCnxn	Session establishment
complete on server 100.65.188.168/100.65.188.168:2181, sessionid = 0x25cd1e82c110001, negotiated
timeout = 12000	0
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message