From: Charity Majors
To: "zookeeper-user@hadoop.apache.org"
Date: Wed, 14 Apr 2010 15:59:12 -0700
Subject: rolling upgrade 3.2.1 -> 3.3.0

Hi. I'm trying to upgrade a zookeeper cluster from 3.2.1 to 3.3.0, and having problems.
I can't get a 3.3.0 node to successfully join the cluster and stay joined.

If I run zkServer.sh status immediately after starting up the newly upgraded node, it says the service is probably not running, and shows me this:

[charity@test-zookeeper001 zookeeper-current]$ bin/zkServer.sh status
JMX enabled by default
Using config: /services/zookeeper/zookeeper-20100412.1/bin/../conf/zoo.cfg
2010-04-14 22:47:35,574 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:40287
2010-04-14 22:47:35,576 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@968] - Processing stat command from /127.0.0.1:40287
2010-04-14 22:47:35,577 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@606] - EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
2010-04-14 22:47:35,578 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1286] - Closed socket connection for client /127.0.0.1:40287 (no session established for client)
Error contacting service. It is probably not running.
[charity@test-zookeeper001 zookeeper-current]$ 2010-04-14 22:47:35,580 - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1310] - ignoring exception during input shutdown
java.net.SocketException: Transport endpoint is not connected
        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
        at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
        at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1306)
        at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1263)
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:609)
        at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:262)

If I connect with zkCli.sh, I can list the contents of zookeeper.
If I make changes to the schema on either of the other two nodes, test-zookeeper002 and test-zookeeper003, both of which are running 3.2.1, the changes are reflected on test-zookeeper001, which is running 3.3.0.

When I exit zkCli.sh, however, zkServer.sh status starts flapping between "Error contacting service. It is probably not running." and "Mode: follower", as you can see below.

Any ideas? I'd really rather not have to take the production zookeeper cluster down to upgrade if it's not necessary.

Thanks,
Charity.

[charity@test-zookeeper001 zookeeper-current]$ bin/zkServer.sh status
JMX enabled by default
Using config: /services/zookeeper/zookeeper-20100412.1/bin/../conf/zoo.cfg
2010-04-14 22:53:16,848 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:55284
2010-04-14 22:53:16,849 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@968] - Processing stat command from /127.0.0.1:55284
2010-04-14 22:53:16,849 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@606] - EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
2010-04-14 22:53:16,850 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1286] - Closed socket connection for client /127.0.0.1:55284 (no session established for client)
Error contacting service. It is probably not running.
2010-04-14 22:53:16,850 - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1310] - ignoring exception during input shutdown
java.net.SocketException: Transport endpoint is not connected
        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
        at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
        at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1306)
        at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1263)
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:609)
        at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:262)

[charity@test-zookeeper001 zookeeper-current]$ bin/zkServer.sh status
JMX enabled by default
Using config: /services/zookeeper/zookeeper-20100412.1/bin/../conf/zoo.cfg
2010-04-14 22:53:18,908 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:55285
2010-04-14 22:53:18,909 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@968] - Processing stat command from /127.0.0.1:55285
2010-04-14 22:53:18,909 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@606] - EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
2010-04-14 22:53:18,910 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1286] - Closed socket connection for client /127.0.0.1:55285 (no session established for client)
2010-04-14 22:53:18,910 - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1310] - ignoring exception during input shutdown
java.net.SocketException: Transport endpoint is not connected
        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
        at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
        at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1306)
        at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1263)
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:609)
        at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:262)
2010-04-14 22:53:18,911 - ERROR [Thread-13:NIOServerCnxn$Factory$1@82] - Thread Thread[Thread-13,5,main] died
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
        at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:64)
        at org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.wakeup(NIOServerCnxn.java:927)
        at org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.checkFlush(NIOServerCnxn.java:909)
        at org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.flush(NIOServerCnxn.java:945)
        at java.io.BufferedWriter.flush(BufferedWriter.java:236)
        at java.io.PrintWriter.flush(PrintWriter.java:276)
        at org.apache.zookeeper.server.NIOServerCnxn$2.run(NIOServerCnxn.java:1089)
Mode: follower
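
The server log above shows each status check arriving as a four-letter "stat" command on the client port (2181). To watch the flapping without going through zkServer.sh, the same command can be sent directly and the "Mode:" line picked out of the reply. The following is only a minimal sketch, assuming Python 3 is available on the node and the server listens on 127.0.0.1:2181 as in the log; the helper names parse_mode, query_stat and poll are made up for illustration and are not part of ZooKeeper.

```python
import socket
import time


def parse_mode(stat_output):
    """Return the value of the 'Mode:' line in a stat reply, or None if absent."""
    for line in stat_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def query_stat(host="127.0.0.1", port=2181, timeout=5.0):
    """Send the four-letter 'stat' command and return the raw reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stat")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # an empty read means the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


def poll(host="127.0.0.1", port=2181, attempts=10, delay=2.0):
    """Query the server several times and print what each attempt reported."""
    for i in range(attempts):
        try:
            mode = parse_mode(query_stat(host, port))
        except OSError as exc:  # refused, reset, or timed-out connections
            print(f"attempt {i + 1}: connection error: {exc}")
        else:
            # mode is None when the reply had no 'Mode:' line (e.g. empty reply)
            print(f"attempt {i + 1}: mode={mode!r}")
        time.sleep(delay)
```

Calling poll() on test-zookeeper001 would print one line per attempt. An attempt that reports mode=None corresponds to the case where zkServer.sh finds no "Mode:" line and prints "Error contacting service."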