zookeeper-dev mailing list archives

From "Michael Han (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-1441) Some test cases are failing because Port bind issue.
Date Sat, 10 Nov 2018 01:11:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682124#comment-16682124 ]

Michael Han commented on ZOOKEEPER-1441:

PortAssignment itself is fine, and if everyone used it there would be no conflicts, because
PortAssignment would be the single source of truth for port allocation. The problem is that
not every process running on the test machine uses PortAssignment, even though most, if not
all, ZK unit tests do. So if there are heavy workloads running on the test machine
while ZK unit tests are running, port conflicts can occur.

>> I never actually got why PortAssigment tries to bind the port before returns

What PortAssignment implements is a "reserve and release" pattern for port allocation, and
this is better than a "choose a port but do not reserve it" approach, because it is very unlikely
that the OS, regardless of how it allocates actual ports to processes, will hand out the same
port for two consecutive socket bind calls. Thus, by binding a socket to the port and then immediately
closing it, we buy ourselves some time during which the OS will not reuse this same port for a subsequent
socket call. This time varies, however, so there is a race condition: by the time
we actually bind this port again, it may already have been grabbed by another process. The ZK
server requires an unbound port number to be passed to it (otherwise it can't bind the port),
but due to the same race condition it's possible that when the server tries to bind, the port has
already been taken. The only way to guarantee atomicity in this case is to have the ZK server ask
the OS for a port and bind it immediately.
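
To make the difference concrete, here is a minimal sketch of the two approaches using plain
java.net sockets. This is not the actual PortAssignment code: the class name PortSketch and both
method names are hypothetical, and the probe asks the OS for any free port (port 0) purely for
illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration only; not taken from the ZooKeeper code base.
public class PortSketch {

    // "Reserve and release": bind a port, remember its number, close the socket.
    // The returned number is only *probably* free: another process can bind it
    // between this method returning and the caller binding it again.
    static int reserveAndRelease() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    // Atomic alternative: ask the OS for a free port and keep the socket bound.
    // There is no window in which another process can take the port.
    static ServerSocket bindEphemeral() throws IOException {
        ServerSocket server = new ServerSocket();
        server.bind(new InetSocketAddress(0)); // port 0 = let the OS choose
        return server;
    }

    public static void main(String[] args) throws IOException {
        int candidate = reserveAndRelease();
        System.out.println("released port (could be taken again): " + candidate);

        try (ServerSocket server = bindEphemeral()) {
            System.out.println("port held by this process: " + server.getLocalPort());
        }
    }
}
```

In the second variant the caller learns the port number only after the bind has succeeded, which
is why the server itself, rather than the test harness, would have to do the asking.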

> Some test cases are failing because Port bind issue.
> ----------------------------------------------------
>                 Key: ZOOKEEPER-1441
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1441
>             Project: ZooKeeper
>          Issue Type: Test
>          Components: server, tests
>            Reporter: kavita sharma
>            Assignee: Michael Han
>            Priority: Major
>              Labels: flaky, flaky-test
> Very frequently, test cases are failing because of:
> java.net.BindException: Address already in use
> 	at sun.nio.ch.Net.bind(Native Method)
> 	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
> 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
> 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
> 	at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:111)
> 	at org.apache.zookeeper.server.ServerCnxnFactory.createFactory(ServerCnxnFactory.java:112)
> 	at org.apache.zookeeper.server.quorum.QuorumPeer.<init>(QuorumPeer.java:514)
> 	at org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:156)
> 	at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:103)
> 	at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:67)
> This may be because of PortAssignment, so please give me some suggestions if anyone else is
> facing the same problem.

This message was sent by Atlassian JIRA
