hadoop-hdfs-issues mailing list archives

From "Eric Badger (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-10755) TestDecommissioningStatus BindException Failure
Date Fri, 12 Aug 2016 13:35:20 GMT

     [ https://issues.apache.org/jira/browse/HDFS-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated HDFS-10755:
    Attachment: HDFS-10755.002.patch

Attaching a patch to address the checkstyle comments. Both test failures appear unrelated,
and they did not fail locally when I ran them with this patch.

> TestDecommissioningStatus BindException Failure
> -----------------------------------------------
>                 Key: HDFS-10755
>                 URL: https://issues.apache.org/jira/browse/HDFS-10755
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>         Attachments: HDFS-10755.001.patch, HDFS-10755.002.patch
> Tests in TestDecommissioningStatus call MiniDFSCluster.restartDataNode(). The restarted datanodes
are required to come back up on the same (initially ephemeral) port that they were bound to before being
shut down. Because of this, there is an inherent race condition: another process can bind to the
port while the datanode is down. If that happens, we get a BindException failure. Moreover,
all of the tests in TestDecommissioningStatus depend on the cluster being up and running in order
to run correctly, so if one test blows up the cluster, the subsequent tests will also fail.
Below I show the BindException failure as well as the subsequent test failure that occurred.
> {noformat}
> java.net.BindException: Problem binding to [localhost:35370] java.net.BindException:
Address already in use; For more details see:  http://wiki.apache.org/hadoop/BindException
> 	at sun.nio.ch.Net.bind0(Native Method)
> 	at sun.nio.ch.Net.bind(Net.java:436)
> 	at sun.nio.ch.Net.bind(Net.java:428)
> 	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
> 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
> 	at org.apache.hadoop.ipc.Server.bind(Server.java:430)
> 	at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:768)
> 	at org.apache.hadoop.ipc.Server.<init>(Server.java:2391)
> 	at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:951)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:523)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:498)
> 	at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:796)
> 	at org.apache.hadoop.hdfs.server.datanode.DataNode.initIpcServer(DataNode.java:802)
> 	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1134)
> 	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:429)
> 	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2387)
> 	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2274)
> 	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2321)
> 	at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2037)
> 	at org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus.testDecommissionDeadDN(TestDecommissioningStatus.java:426)
> {noformat}
> {noformat}
> java.lang.AssertionError: Number of Datanodes  expected:<2> but was:<1>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:743)
> 	at org.junit.Assert.assertEquals(Assert.java:118)
> 	at org.junit.Assert.assertEquals(Assert.java:555)
> 	at org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus.testDecommissionStatus(TestDecommissioningStatus.java:275)
> {noformat}
> I don't think there's any way to avoid the inherent race condition in reclaiming the same
ephemeral port, but we can definitely fix the tests so that a BindException in one test doesn't
cause the subsequent tests to fail.
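> The race described above can be sketched with plain java.net sockets, independent of the Hadoop test harness. This is an illustrative sketch, not code from TestDecommissioningStatus: the class name and the "intruder" socket are hypothetical stand-ins for the restarted datanode and a competing process.

```java
import java.net.BindException;
import java.net.ServerSocket;

public class EphemeralPortRace {
    public static void main(String[] args) throws Exception {
        // A service binds to port 0, letting the OS pick an ephemeral port
        // (this mirrors how MiniDFSCluster datanodes get their initial ports).
        ServerSocket datanode = new ServerSocket(0);
        int port = datanode.getLocalPort();
        System.out.println("datanode bound to ephemeral port " + port);

        // Simulate a restart: the port is released while the service is down...
        datanode.close();

        // ...and another process grabs the now-free port in the meantime.
        ServerSocket intruder = new ServerSocket(port);

        // The restarted service must reclaim the *same* port, so the rebind
        // fails with the BindException seen in the stack trace above.
        try {
            ServerSocket restarted = new ServerSocket(port);
            restarted.close();
            System.out.println("rebind succeeded (no race this time)");
        } catch (BindException e) {
            System.out.println("BindException: " + e.getMessage());
        } finally {
            intruder.close();
        }
    }
}
```

> Here the intruder always wins, so the rebind always fails; in the real tests the window is only open between shutdown and restart, which is why the failure is intermittent.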

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
