zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Han (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-2867) an expired ZK session can be re-established
Date Thu, 31 May 2018 00:14:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16495919#comment-16495919
] 

Michael Han commented on ZOOKEEPER-2867:
----------------------------------------

[~junrao] - ok, finally get back to this :=).

I checked the broker 1 and 2's commit log, session {{0x25cd1e82c110001}} was referenced after
the "session close" log statement. For example, in server 1's log file, the first reference
of the session 0x25cd1e82c110001 after it's "closed" is:
{quote}6/27/17 3:05:20 PM PDT session 0x25cd1e82c110001 cxid 0xdc zxid 0x30000047b setData
'/cp15/brokers/topics/xxxx/state,#7b22636f6e74726f6c6c65725f65706f6368223a312c226c6561646572223a302c2276657273696f6e223a312c226c65616465725f65706f6368223a352c22697372223a5b302c315d7d,10
{quote}
Similarly for server 2's commit log session 0x25cd1e82c110001 is referenced after the "closed"
statement as well.

I think in this case session 0x25cd1e82c110001 is not get closed. It could be that the close
requests were sent to server 1 and 2 and then something happened so the close requests never
committed. Note the "close session" entry in the commit log is missing some context so I can't
really tell where this log statement was from - if we have more detailed log such to indicate
where this log statement was coming from, I might be able to tell the state of the server
at that time. 

One way to make sure the close session proposal really gets committed across the quorum is
to looking for this log statement (need to enable debug logging): [https://github.com/apache/zookeeper/blob/branch-3.4/src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java#L164]

At this point, the proposal reached consensus and is persistent across a quorum of nodes (the
state change might not yet applied to the data tree, but as long as the proposal is persistent
we are fine). 

> an expired ZK session can be re-established
> -------------------------------------------
>
>                 Key: ZOOKEEPER-2867
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2867
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.10
>            Reporter: Jun Rao
>            Assignee: Michael Han
>            Priority: Major
>         Attachments: 0.parsed_commit_log, 1.parsed_commit_log, 2.parsed_commit_log, zk.0.08-02,
zk.0.08-02, zk.0.formatted, zk.1.08-02, zk.1.08-02, zk.1.formatted, zk.2.08-02, zk.2.08-02
>
>
> Not sure if this is a real bug, but I found an instance when a ZK client seems to be
able to renew a session already expired by the ZK server.
> From ZK server log, session 25cd1e82c110001 was expired at 22:04:39.
> {code:java}
> June 27th 2017, 22:04:39.000	INFO	org.apache.zookeeper.server.ZooKeeperServer	Expiring
session 0x25cd1e82c110001, timeout of 12000ms exceeded
> June 27th 2017, 22:04:39.001	DEBUG	org.apache.zookeeper.server.quorum.Leader	Proposing::
sessionid:0x25cd1e82c110001 type:closeSession cxid:0x0 zxid:0x200000fc4 txntype:-11 reqpath:n/a
> June 27th 2017, 22:04:39.001	INFO	org.apache.zookeeper.server.PrepRequestProcessor	Processed
session termination for sessionid: 0x25cd1e82c110001
> June 27th 2017, 22:04:39.001	DEBUG	org.apache.zookeeper.server.quorum.CommitProcessor
Processing request:: sessionid:0x25cd1e82c110001 type:closeSession cxid:0x0 zxid:0x200000fc4
txntype:-11 reqpath:n/a
> June 27th 2017, 22:05:20.324	INFO	org.apache.zookeeper.server.quorum.Learner	Revalidating
client: 0x25cd1e82c110001
> June 27th 2017, 22:05:20.324	INFO	org.apache.zookeeper.server.ZooKeeperServer	Client
attempting to renew session 0x25cd1e82c110001 at /100.96.5.6:47618
> June 27th 2017, 22:05:20.325	INFO	org.apache.zookeeper.server.ZooKeeperServer	Established
session 0x25cd1e82c110001 with negotiated timeout 12000 for client /100.96.5.6:47618
> {code}
> From ZK client's log, it was able to renew the expired session on 22:05:20.
> {code:java}
> June 27th 2017, 22:05:18.590	INFO	org.apache.zookeeper.ClientCnxn	Client session timed
out, have not heard from server in 4485ms for sessionid 0x25cd1e82c110001, closing socket
connection and attempting reconnect	0
> June 27th 2017, 22:05:18.590	WARN	org.apache.zookeeper.ClientCnxn	Client session timed
out, have not heard from server in 4485ms for sessionid 0x25cd1e82c110001	0
> June 27th 2017, 22:05:19.325	WARN	org.apache.zookeeper.ClientCnxn	SASL configuration
failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client'
was found in specified JAAS configuration file: '/opt/confluent/etc/kafka/server_jaas.conf'.
Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server
allows it.	0
> June 27th 2017, 22:05:19.326	INFO	org.apache.zookeeper.ClientCnxn	Opening socket connection
to server 100.65.188.168/100.65.188.168:2181	0
> June 27th 2017, 22:05:20.324	INFO	org.apache.zookeeper.ClientCnxn	Socket connection established
to 100.65.188.168/100.65.188.168:2181, initiating session	0
> June 27th 2017, 22:05:20.327	INFO	org.apache.zookeeper.ClientCnxn	Session establishment
complete on server 100.65.188.168/100.65.188.168:2181, sessionid = 0x25cd1e82c110001, negotiated
timeout = 12000	0
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message