zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Park (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-3280) ClientCnxn xid rollover can break sessions
Date Tue, 19 Feb 2019 21:23:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772346#comment-16772346
] 

Jonathan Park commented on ZOOKEEPER-3280:
------------------------------------------

I'm afraid I don't know enough to help out. 

> ClientCnxn xid rollover can break sessions
> ------------------------------------------
>
>                 Key: ZOOKEEPER-3280
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3280
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.4.6, 3.4.12
>            Reporter: Jonathan Park
>            Priority: Major
>         Attachments: ZKOutOfOrder.java
>
>
>  
> {code:java}
> 2019-02-15 13:40:21,471 [myid:] - DEBUG [main-SendThread(localhost:2181):ClientCnxn$SendThread@759]
- Got auth sessionid:0x168f2c5e9c60017
> 2019-02-15 13:40:21,472 [myid:] - WARN  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1166]
- Session 0x168f2c5e9c60017 for server localhost/0:0:0:0:0:0:0:1:2181, unexpected error, closing
socket connection and attempting reconnect
> java.io.IOException: Xid out of order. Got Xid -3 with err 0 expected Xid -4 for a packet
with details: clientPath:null serverPath:null finished:false header:: -4,8 replyHeader:: 0,0,-4
request:: '/,F response:: v{} 
> at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:828)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:94)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1143)
> 2019-02-15 13:40:22,520 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1027]
- Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate
using SASL (unknown error)
> 2019-02-15 13:40:22,521 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@877]
- Socket connection established to localhost/127.0.0.1:2181, initiating session
> 2019-02-15 13:40:22,521 [myid:] - DEBUG [main-SendThread(localhost:2181):ClientCnxn$SendThread@950]
- Session establishment request sent on localhost/127.0.0.1:2181
> 2019-02-15 13:40:22,522 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1301]
- Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x168f2c5e9c60017,
negotiated timeout = 30000
> 2019-02-15 13:40:22,525 [myid:] - DEBUG [main-SendThread(localhost:2181):ClientCnxn$SendThread@742]
- Got ping response for sessionid: 0x168f2c5e9c60017 after 235329552ms
> {code}
> ClientCnxn xid's are tracked as java int's. For long-lived ZK clients this can lead to
rollover into the negative xid space. Xid = -4 is treated as a special xid reserved for auth
requests. With xid rollover, a normal ZK request can also have xid = -4 but the response will
be treated as an auth response making subsequent packet processing error with the exception
above. We can reproduce this more readily by changing the starting xid in ClientCnxn from
1 to -100. The ZK client will transparently reconnect and establish a new session but features
that depended on the same session persisting will unnecessarily experience a disconnected
event.
>  
> I've attached simple class with a main() method that reproduces the failure quickly against
a local ZK server after modifying the initial value of ClientCnxn.xid from 1 to -100. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message