Date: Sat, 10 Nov 2018 01:11:00 +0000 (UTC)
From: "Michael Han (JIRA)"
To: dev@zookeeper.apache.org
Subject: [jira] [Commented] (ZOOKEEPER-1441) Some test cases are failing because Port bind issue.

[ https://issues.apache.org/jira/browse/ZOOKEEPER-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682124#comment-16682124 ]

Michael Han commented on ZOOKEEPER-1441:
----------------------------------------

PortAssignment itself is fine. If every process used it, there would be no conflicts, because PortAssignment would be the single source of truth for port allocation. The problem is that not every process running on the test machine uses PortAssignment, even though most, if not all, ZK unit tests do. So if heavy workloads are running on the test machine while the ZK unit tests run, port conflicts can occur.

>> I never actually got why PortAssigment tries to bind the port before returns

What PortAssignment implements is a "reserve and release" pattern for port allocation. This is better than a "choose a port but do not reserve it" approach, because it is very unlikely that the OS, regardless of how it allocates actual ports to processes, will hand out the same port for two consecutive socket bind calls. Thus, by binding a socket to the port and then immediately closing it, we buy ourselves some time during which the OS will not reuse the same port for a subsequent socket call.
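
To make the "reserve and release" idea concrete, here is a minimal sketch. It is not the actual PortAssignment code: the class name PortReserveSketch, the helper reserveAndRelease, and the port range are all made up for illustration. It probes candidate ports by binding, closes the probe socket, and returns the port number for the caller to bind later.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class PortReserveSketch {

    // Hypothetical helper (not ZooKeeper's PortAssignment): walk a port range,
    // bind each candidate to prove it is free, then release it and return it.
    static int reserveAndRelease(int start, int end) {
        for (int port = start; port <= end; port++) {
            try (ServerSocket probe = new ServerSocket()) {
                probe.setReuseAddress(true);
                probe.bind(new InetSocketAddress(port));
                // The probe socket is closed as we leave the try block,
                // so the port is unbound again when the caller receives it.
                return port;
            } catch (IOException e) {
                // Port is in use (or not bindable); try the next candidate.
            }
        }
        throw new IllegalStateException("no free port in " + start + "-" + end);
    }

    public static void main(String[] args) throws IOException {
        int port = reserveAndRelease(11221, 11321); // range chosen arbitrarily

        // Between the release above and the bind below, any other process
        // on the machine is free to bind this port.
        try (ServerSocket server = new ServerSocket(port)) {
            System.out.println("bound to port " + server.getLocalPort());
        }
    }
}
```

The probe only shows that the port was free at the moment of the check; nothing holds it once the probe socket is closed.
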
This time varies, however, so there is a race condition: by the time we actually bind the port again, it may already have been grabbed by another process. The ZK server requires that an unbound port number be passed to it (otherwise it can't bind the port), but because of the same race condition, the port may already be taken by the time the server tries to bind. The only way to guarantee atomicity in this case is to have the ZK server ask the OS for a port and bind it immediately.

> Some test cases are failing because Port bind issue.
> ----------------------------------------------------
>
>                 Key: ZOOKEEPER-1441
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1441
>             Project: ZooKeeper
>          Issue Type: Test
>          Components: server, tests
>            Reporter: kavita sharma
>            Assignee: Michael Han
>            Priority: Major
>              Labels: flaky, flaky-test
>
> very frequently testcases are failing because of :
> java.net.BindException: Address already in use
> 	at sun.nio.ch.Net.bind(Native Method)
> 	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
> 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
> 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
> 	at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:111)
> 	at org.apache.zookeeper.server.ServerCnxnFactory.createFactory(ServerCnxnFactory.java:112)
> 	at org.apache.zookeeper.server.quorum.QuorumPeer.<init>(QuorumPeer.java:514)
> 	at org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:156)
> 	at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:103)
> 	at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:67)
> may be because of Port Assignment so please give me some suggestions if someone is also facing same problem.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)