hadoop-zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Patrick Hunt <ph...@apache.org>
Subject Re: Errors during shutdown/startup of ZooKeeper
Date Tue, 02 Jun 2009 22:09:08 GMT
This log manifests if the client is running ahead of the server.

say you have:
1) client connects to server A and sees some changes
2) client gets disconnected from A and attempts to connect to B
3) B can be running behind A by some number of changes (it will 
eventually catch up)
4) client will attempt to connect to another server that's at, or ahead 
of it's zxid until successful.

why? this ensures that the client never sees old data, part of the 
guarantee you are provided when using zk. However since servers in a 
quorum can run behind (minority) then you might see this.

It's unusual to see this so many times however. I see that you are 
running this as part of a junit test. Perhaps that has some impact? Are 
you shutting down servers, perhaps clearing the datadir and restarting 
them, w/o closing all of the clients? If your tests are not running in 
"fork mode" for junit (or multiple tests w/in a junit test class) then 
old clients can hang around _if not explicitly closed_ and try to 
re-connect to new servers that you are using for new tests - if the 
servers are starting fresh (zxid=1) then you can see this alot as the 
old (zombie) clients cannot connect to the new servers. Perhaps this is 
what you are seeing?

Patrick

Nitay wrote:
> I see. That helps. However, even as warnings, these go on seemingly
> endlessly. Why do they not get fixed by themselves? What are we doing wrong
> here?
> 
> On Tue, Jun 2, 2009 at 2:24 PM, Mahadev Konar <mahadev@yahoo-inc.com> wrote:
> 
>> Hi Nitay,
>>  This is not an error but should be a warning. I have opened up a jira for
>> it.
>>
>> http://issues.apache.org/jira/browse/ZOOKEEPER-428
>>
>>
>> The message just says that a client is connecting to a server that is
>> behind
>> that a server is was connected to earlier. The log should be warn and not
>> error and should be fixed in the next release.
>>
>> mahadev
>>
>> On 6/2/09 2:12 PM, "Nitay" <nitayj@gmail.com> wrote:
>>
>>> Hey guys,
>>>
>>> We are getting a lot of messages like this in HBase:
>>>
>>> [junit] 2009-06-02 11:57:23,658 ERROR [NIOServerCxn.Factory:21810]
>>> server.NIOServerCnxn(514): Client has seen zxid 0xe our last zxid is 0xd
>>>
>>> For more context, the block it usually appears in is:
>>>
>>>     [junit] 2009-06-02 13:27:54,083 INFO  [main-SendThread]
>>> zookeeper.ClientCnxn$SendThread(737): Priming connection to
>>> java.nio.channels.SocketChannel[connected local=/0:0:0:0:0:0:0:1%0:56511
>>> remote=localhost/0:0:0:0:0:0:0:1:21810]
>>>     [junit] 2009-06-02 13:27:54,084 INFO  [main-SendThread]
>>> zookeeper.ClientCnxn$SendThread(889): Server connection successful
>>>     [junit] 2009-06-02 13:27:54,093 INFO  [NIOServerCxn.Factory:21810]
>>> server.NIOServerCnxn(532): Connected to /0:0:0:0:0:0:0:1%0:56511 lastZxid
>> 16
>>>     [junit] 2009-06-02 13:27:54,094 ERROR [NIOServerCxn.Factory:21810]
>>> server.NIOServerCnxn(543): Client has seen zxid 0x10 our last zxid is 0x4
>>>     [junit] 2009-06-02 13:27:54,094 WARN  [NIOServerCxn.Factory:21810]
>>> server.NIOServerCnxn(444): Exception causing close of session 0x0 due to
>>> java.io.IOException: Client has seen zxid 0x10 our last zxid is 0x4
>>>     [junit] 2009-06-02 13:27:54,094 DEBUG [NIOServerCxn.Facto777ry:21810]
>>> server.NIOServerCnxn(447): IOException stack trace
>>>     [junit] java.io.IOException: Client has seen zxid 0x10 our last zxid
>> is
>>> 0x4
>>>     [junit]     at
>>>
>> org.apache.zookeeper.server.NIOServerCnxn.readConnectRequest(NIOServerCnxn.jav
>>> a:544)
>>>     [junit]     at
>>> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:331)
>>>     [junit]     at
>>>
>> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:176)
>>>     [junit] 2009-06-02 13:27:54,094 INFO  [NIOServerCxn.Factory:21810]
>>> server.NIOServerCnxn(777): closing session:0x0 NIOServerCnxn:
>>> java.nio.channels.SocketChannel[connected local=/0:0:0:0:0:0:0:1%0:21810
>>> remote=/0:0:0:0:0:0:0:1%0:56511]
>>>     [junit] 2009-06-02 13:27:54,097 WARN  [main-SendThread]
>>> zookeeper.ClientCnxn$SendThread(919): Exception closing session
>>> 0x121a2a7c43a0002 to sun.nio.ch.SelectionKeyImpl@2c662b4e
>>>     [junit] java.io.IOException: Read error rc = -1
>>> java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
>>>     [junit]     at
>>> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:653)
>>>     [junit]     at
>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
>>>     [junit] 2009-06-02 13:27:54,097 WARN  [main-SendThread]
>>> zookeeper.ClientCnxn$SendThread(953): Ignoring exception during shutdown
>>> input
>>>     [junit] java.net.SocketException: Socket is not connected
>>>     [junit]     at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>>>     [junit]     at
>>> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
>>>     [junit]     at
>>> sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>     [junit]     at
>>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:951)
>>>     [junit]     at
>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
>>>
>>>
>>> This happens in a seemingly endless loop. We are not quite sure what it
>>> means. Can someone help shed some light on these messages?
>>>
>>> Thanks,
>>> -n
>>
> 

Mime
View raw message