Date: Tue, 24 Mar 2020 05:22:00 +0000 (UTC)
From: "Michael Stack (Jira)"
To: dev@hbase.apache.org
Subject: [jira] [Resolved] (HBASE-23919) [Flakey Test] Standalone Zookeeper won't start (minizookeepercluster won't come up)

[ https://issues.apache.org/jira/browse/HBASE-23919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Stack resolved HBASE-23919.
-----------------------------------
    Resolution: Not A Problem

HBASE-23993 fixed this condition

> [Flakey Test] Standalone Zookeeper won't start (minizookeepercluster won't come up)
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-23919
>                 URL: https://issues.apache.org/jira/browse/HBASE-23919
>             Project: HBase
>          Issue Type: Bug
>          Components: flakies
>            Reporter: Michael Stack
>            Priority: Major
>
> I've seen this on occasion across different hardware; the standalone zookeeper won't come up and then a random unit test fails in its junit startup phase as part of launching the mini cluster.
> I've been trying to track this for a while now, adding in logging and using more of the zk client instead of the copy/paste we've had a long while (I've been running locally w/ zk logging set to INFO).
> It looks like this currently, where the last thing out of the server launch is this....
> {code:java}
> 2020-03-02 07:57:46,129 INFO [Time-limited test] server.ZooKeeperServer(854): maxSessionTimeout set to -1
> 2020-03-02 07:57:46,139 INFO [Time-limited test] server.NIOServerCnxnFactory(89): binding to port 0.0.0.0/0.0.0.0:49316
> 2020-03-02 07:57:46,181 INFO [Time-limited test] zookeeper.MiniZooKeeperCluster(256): Started connectionTimeout=30000, dir=/Users/stack/checkouts/hbase.git/hbase-server/target/test-data/89d57393-fe97-9200-630f-7843ee406bd2/cluster_6b4d6f67-7978-dc67-a1a3-7b1b0c0e4268/zookeeper_0, clientPort=49316, dataDir=/Users/stack/checkouts/hbase.git/hbase-server/target/test-data/89d57393-fe97-9200-630f-7843ee406bd2/cluster_6b4d6f67-7978-dc67-a1a3-7b1b0c0e4268/zookeeper_0/version-2, dataLogDir=/Users/stack/checkouts/hbase.git/hbase-server/target/test-data/89d57393-fe97-9200-630f-7843ee406bd2/cluster_6b4d6f67-7978-dc67-a1a3-7b1b0c0e4268/zookeeper_0/version-2, tickTime=2000, maxClientCnxns=300, minSessionTimeout=4000, maxSessionTimeout=40000, serverId=0
> {code}
> ... then the client just does this over and over:
> {code:java}
> 2020-03-02 07:57:46,182 INFO [Time-limited test] client.FourLetterWordMain(65): connecting to localhost 49316
> 2020-03-02 07:57:46,213 INFO [Time-limited test] zookeeper.MiniZooKeeperCluster(453): localhost:49316 not up
> java.net.SocketException: Connection reset
>   at java.net.SocketInputStream.read(SocketInputStream.java:209)
>   at java.net.SocketInputStream.read(SocketInputStream.java:141)
>   at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
>   at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
>   at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
>   at java.io.InputStreamReader.read(InputStreamReader.java:184)
>   at java.io.BufferedReader.fill(BufferedReader.java:161)
>   at java.io.BufferedReader.readLine(BufferedReader.java:324)
>   at java.io.BufferedReader.readLine(BufferedReader.java:389)
>   at org.apache.zookeeper.client.FourLetterWordMain.send4LetterWord(FourLetterWordMain.java:84)
>   at org.apache.hadoop.hbase.zookeeper.MiniZooKeeperCluster.waitForServerUp(MiniZooKeeperCluster.java:442)
>   at org.apache.hadoop.hbase.zookeeper.MiniZooKeeperCluster.startup(MiniZooKeeperCluster.java:259)
>   at org.apache.hadoop.hbase.HBaseZKTestingUtility.startMiniZKCluster(HBaseZKTestingUtility.java:130)
>   at org.apache.hadoop.hbase.HBaseZKTestingUtility.startMiniZKCluster(HBaseZKTestingUtility.java:103)
>   at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1107)
>   at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1065)
>   at org.apache.hadoop.hbase.regionserver.TestRegionReplicasWithModifyTable.before(TestRegionReplicasWithModifyTable.java:62)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
>
> There is ZOOKEEPER-2714, but it is closed as not-reproducible and it looks a little different going by the log emissions, in that it registers the client connections.
> Noting this here in case others have seen similar or there are ideas out there for how to fix.
> Here, for example, is how the failure might look high-level:
> {code:java}
> -------------------------------------------------------------------------------
> Test set: org.apache.hadoop.hbase.regionserver.TestRegionReplicasWithModifyTable
> -------------------------------------------------------------------------------
> Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.017 s <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestRegionReplicasWithModifyTable
> org.apache.hadoop.hbase.regionserver.TestRegionReplicasWithModifyTable  Time elapsed: 0.01 s <<< ERROR!
> java.io.IOException: Waiting for startup of standalone server; server isRunning=true
>   at org.apache.hadoop.hbase.regionserver.TestRegionReplicasWithModifyTable.before(TestRegionReplicasWithModifyTable.java:62)
> {code}
> ... and then when you dig in, the test never even made it past cluster launch in setup.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
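
For readers unfamiliar with the startup check seen in the trace (MiniZooKeeperCluster.waitForServerUp calling FourLetterWordMain.send4LetterWord), the retry loop can be sketched roughly as below. This is only an illustration under stated assumptions, not the HBase or ZooKeeper code: the class FourLetterProbe, its method names, the 250 ms pause and the 1 s socket timeout are all hypothetical. It assumes the server answers the "stat" four-letter command with a line that begins "Zookeeper version:".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not the HBase or ZooKeeper implementation. */
final class FourLetterProbe {

  /** Assumption: a healthy server's "stat" reply starts with this prefix. */
  static boolean isUpResponse(String firstLine) {
    return firstLine != null && firstLine.startsWith("Zookeeper version:");
  }

  /** Sends one four-letter command and returns the first reply line (null if none). */
  static String sendFourLetterWord(String host, int port, String cmd, int socketTimeoutMs)
      throws IOException {
    try (Socket sock = new Socket()) {
      sock.connect(new InetSocketAddress(host, port), socketTimeoutMs);
      sock.setSoTimeout(socketTimeoutMs);
      OutputStream out = sock.getOutputStream();
      out.write(cmd.getBytes(StandardCharsets.US_ASCII));
      out.flush();
      BufferedReader in = new BufferedReader(
          new InputStreamReader(sock.getInputStream(), StandardCharsets.US_ASCII));
      return in.readLine();
    }
  }

  /** Probes until the server answers "stat" or timeoutMs elapses; returns true if it came up. */
  static boolean waitForServerUp(String host, int port, long timeoutMs) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (true) {
      try {
        if (isUpResponse(sendFourLetterWord(host, port, "stat", 1000))) {
          return true;
        }
      } catch (IOException e) {
        // A "Connection reset" like the one in the report lands here: treat as "not up yet".
        System.out.println(host + ":" + port + " not up: " + e);
      }
      if (System.currentTimeMillis() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(250); // hypothetical pause between probes
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }
}
```

A caller would use something like `FourLetterProbe.waitForServerUp("localhost", 49316, 30000)`, with the port and timeout taken from the log in the report. In the failure described, every probe would hit the IOException branch until the deadline passes, which matches the "not up" lines repeating before the test errors out.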