From: "Parise, Jonathan"
To: "user@accumulo.apache.org"
Subject: RE: MiniAccumuloClutser Unit Test Problems
Date: Fri, 26 Jun 2015 14:14:32 +0000

Josh,

Sorry for the confusion. I'll try to explain again with better terminology.

I have several test classes, each of which contains several test methods. For each test class I have an @BeforeClass that configures and starts a MAC. I also have an @AfterClass that calls MAC.stop().

So the flow is: a MAC is created, 1-n tests in the class run against it, the MAC is stopped, repeat for the next N tests.

This way several @Test methods in the same class use the same MAC. I mostly did this to make the tests run faster. I understand that having several test methods share a MAC could cause test state pollution, but I am careful to avoid that.

The issue I was seeing was that if I simply ran "mvn test", the first few tests would pass and then eventually one of the tests would get stuck forever and just keep throwing connection errors.

When I changed the Maven configuration to force the tests to run serially, the issue stopped occurring.

I'm not sure if these tests are expected to work when run in parallel. My best guess is that the tests may conflict with each other over ports or something like that.

I'd like to understand why changing the test running behavior fixed this issue. Also, I think it would be good to document this somewhere. The MAC javadoc and also the user's guide should provide details about using the MAC for tests and properly configuring those tests.

Does that make more sense?

Jon

-----Original Message-----
From: Josh Elser [mailto:josh.elser@gmail.com]
Sent: Friday, June 26, 2015 10:00 AM
To: user@accumulo.apache.org
Subject: Re: MiniAccumuloClutser Unit Test Problems

Jonathan,

If you're not seeing consistent behavior starting and stopping a MiniAccumuloCluster repeatedly, that's a bug. If you can provide code which shows this problem, that'd be a huge help.

If you can get a list of the processes running when you see this happen and cross-reference it with what processes should be running, that would also go a long way in trying to debug this.

I am a little confused as to your specific situation. You said that you construct and start a MAC instance in a BeforeClass and stop it in an AfterClass, but then you said that you start and stop it for each test.

Are you saying that after the third construction and use of a MAC, you see problems? Or, are you saying that you stop and start each MAC instance before you run the @Test methods?

- Josh

Parise, Jonathan wrote:
> Hello,
>
> I have been writing some JUnit tests based on the MiniAccumuloCluster class. I'm experiencing some issues when several of the tests run back to back. Before I get into the error, let me explain how the tests work in general. Also, I am using Accumulo 1.6.2.
>
> Each test has an @BeforeClass method that first creates a new random directory. It then makes a new MiniAccumuloCluster instance using that directory as the dir parameter. Then, I call MiniAccumuloCluster.start().
>
> There are several @Test methods in each test class. The typical pattern for them is that they create any necessary tables, write some data into those tables and then scan to verify it was written correctly. Basically they are testing that I can serialize and deserialize various types of Objects correctly.
>
> Then the test class has an @AfterClass method that calls MiniAccumuloCluster.stop().
> It also deletes the random directory used by the previous test.
>
> The issue I am running into is that generally the first test or two run fine. However, the third test usually gets stuck in the MiniAccumuloCluster startup. It just keeps complaining about being unable to connect. Note that if the test is run independently it passes just fine. When run back to back, I see errors like this one repeatedly:
>
> 2015-06-26 09:10:24,352 INFO [main-SendThread(localhost:47046)] zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening socket connection to server localhost/127.0.0.1:47046
> 2015-06-26 09:10:24,353 WARN [main-SendThread(localhost:47046)] zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session 0x14e2ffc86d80004 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1143)
> 2015-06-26 09:10:24,956 INFO [main-SendThread(localhost:10406)] zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening socket connection to server localhost/127.0.0.1:10406
> 2015-06-26 09:10:24,957 WARN [main-SendThread(localhost:10406)] zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session 0x14e2ffcb2390004 for server null, unexpected error, closing socket connection and attempting reconnect
> 2015-06-26 09:10:51,764 INFO [main-SendThread(localhost:10406)] zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening socket connection to server localhost/127.0.0.1:10406
> 2015-06-26 09:10:51,765 WARN [main-SendThread(localhost:10406)] zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session 0x14e2ffcb2390004 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1143)
> 2015-06-26 09:10:52,572 INFO [main-SendThread(localhost:47046)] zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening socket connection to server localhost/127.0.0.1:47046
> 2015-06-26 09:10:52,573 WARN [main-SendThread(localhost:47046)] zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session 0x14e2ffc86d80004 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1143)
> 2015-06-26 09:10:52,890 INFO [main-SendThread(localhost:10406)] zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening socket connection to server localhost/127.0.0.1:10406
> 2015-06-26 09:10:52,891 WARN [main-SendThread(localhost:10406)] zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session 0x14e2ffcb2390004 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1143)
> 2015-06-26 09:10:54,191 INFO [main-SendThread(localhost:10406)] zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening socket connection to server localhost/127.0.0.1:10406
> 2015-06-26 09:10:54,192 WARN [main-SendThread(localhost:10406)] zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session 0x14e2ffcb2390004 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1143)
> 2015-06-26 09:10:54,471 INFO [main-SendThread(localhost:47046)] zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening socket connection to server localhost/127.0.0.1:47046
> 2015-06-26 09:10:54,471 WARN [main-SendThread(localhost:47046)] zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session 0x14e2ffc86d80004 for server null, unexpected error, closing socket connection and attempting reconnect
>
> In the ZooKeeperServerMain.out I see lines like this repeating several thousand times:
>
> 2015-06-26 08:39:44,810 INFO [SyncThread:0] server.NIOServerCnxn (NIOServerCnxn.java:finishSessionInit(1580)) - Established session 0x14e2fe179eb0000 with negotiated timeout 30000 for client /127.0.0.1:36311
> 2015-06-26 08:39:45,278 INFO [ProcessThread:-1] server.PrepRequestProcessor (PrepRequestProcessor.java:pRequest(419)) - Got user-level KeeperException when processing sessionid:0x14e2fe179eb0000 type:create cxid:0x31 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/accumulo/87e85c1f-eb40-4695-b11d-67ed88586648/tables/+r/conf Error:KeeperErrorCode = NodeExists for /accumulo/87e85c1f-eb40-4695-b11d-67ed88586648/tables/+r/conf
> 2015-06-26 08:39:45,300 INFO [ProcessThread:-1] server.PrepRequestProcessor (PrepRequestProcessor.java:pRequest(419)) - Got user-level KeeperException when processing sessionid:0x14e2fe179eb0000 type:create cxid:0x33 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/accumulo/87e85c1f-eb40-4695-b11d-67ed88586648/tables/!0/conf Error:KeeperErrorCode = NodeExists for /accumulo/87e85c1f-eb40-4695-b11d-67ed88586648/tables/!0/conf
>
> I've thought about putting a Thread.sleep() call after the MiniAccumuloCluster.stop() call, but that certainly seems brittle. I'm not sure if that would improve the situation.
>
> It seems to me that the MiniAccumuloCluster does not behave well when instances are started and stopped several times. I am running the tests through Maven using the default test behavior.
>
> Could it be something with Maven? Maybe I need to be more explicit when telling it how to run the tests?
>
> Does anyone have any insight into what is going wrong here? In general, is my usage pattern correct for MiniAccumuloCluster?
>
> Thanks,
>
> Jon Parise
> Senior Software Engineer
> Viz | General Dynamics Mission Systems
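
The guess in this thread that tests running in parallel may conflict over ports can be illustrated with a small Java sketch that uses only the standard library. It is not taken from MiniAccumuloCluster, and whether this is the actual cause here is not established in the thread. The class name PortProbe and its two helper methods are hypothetical. The sketch shows the general hazard: a port that was free when probed is not reserved for the prober, and a bind attempt fails while another holder has the port.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical illustration only; not part of Accumulo or MiniAccumuloCluster.
public class PortProbe {

    /** Asks the OS for an unused port by binding to port 0, then releases it again. */
    public static int findFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            // The port is only guaranteed to be ours while the socket is open.
            return socket.getLocalPort();
        }
    }

    /** Returns true if a listening socket can be bound to the port right now. */
    public static boolean canBind(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            // Typically a BindException: something else is already listening there.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        int port = findFreePort();
        // Nothing holds the port between the probe and this check, so any
        // other process on the machine could have taken it in the meantime.
        System.out.println("probed port bindable afterwards: " + canBind(port));

        // Simulate a second test run that grabbed a port first.
        try (ServerSocket other = new ServerSocket(0)) {
            int taken = other.getLocalPort();
            System.out.println("bindable while held elsewhere: " + canBind(taken));
        }
    }
}
```

If something like this is happening, forcing the test classes to run serially, as described in the reply above, removes the window in which one run's probing overlaps with another run's startup.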