hadoop-hdfs-issues mailing list archives

From "Yongjun Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6054) MiniQJMHACluster should not use static port to avoid binding failure in unit test
Date Wed, 04 Feb 2015 03:38:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304602#comment-14304602 ]

Yongjun Zhang commented on HDFS-6054:

Hi [~brandonli],

Thanks for reporting the issue, and thanks [~kihwal] for pointing me to this jira after we
saw the test failure in HDFS-7707. I hope you don't mind that I assigned it to myself.

I took a look and found the following:
* There is a retry mechanism in MiniQJMHACluster#MiniQJMHACluster(Builder builder) that looks for
available ports.
* That mechanism has a bug in how it increments retryCount: when a BindException is thrown,
retryCount is not incremented.
* {{TestFailureToReadEdits#setUpCluster}} has two branches: one creates the cluster
for SHARED_DIR_HA mode, and the other creates it for QJM_HA mode. The QJM_HA branch uses
the existing retry mechanism; the SHARED_DIR_HA branch is where the failure reported in
this jira happens, because it doesn't retry.
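
For illustration only, the retry pattern described in the list above can be sketched as below. This is not the code from the patch or from MiniQJMHACluster: the class name PortRetrySketch, the helper bindWithRetry, and the choice to probe basePort, basePort + 1, and so on are all assumptions made for the example. The point it shows is that retryCount has to be incremented on the BindException path as well, otherwise the loop would keep trying the same port.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical sketch; not the MiniQJMHACluster implementation.
class PortRetrySketch {

    // Tries basePort, basePort + 1, ... and gives up after maxRetries failed attempts.
    static ServerSocket bindWithRetry(int basePort, int maxRetries) throws IOException {
        int retryCount = 0;
        while (true) {
            int port = basePort + retryCount;
            if (port > 65535) {
                throw new BindException("No free port found starting at " + basePort);
            }
            try {
                return new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
            } catch (BindException e) {
                // Count the failed attempt here. If the increment only happened on
                // another path, the loop would retry the same busy port forever.
                retryCount++;
                if (retryCount > maxRetries) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Occupy an OS-chosen port, then ask the helper to start from that busy port.
        try (ServerSocket occupied = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            int busy = occupied.getLocalPort();
            try (ServerSocket s = bindWithRetry(busy, 10)) {
                System.out.println("busy=" + busy + " bound=" + s.getLocalPort());
            }
        }
    }
}
```

A branch that does not retry at all, such as the SHARED_DIR_HA case described above, would correspond to calling the helper with maxRetries set to 0: the first BindException is simply rethrown.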

I'm attaching patch rev 001, which fixes the retryCount bug and also adds a retry mechanism to the SHARED_DIR_HA branch.

Hi [~kihwal] and [~brandonli], could you help review the patch?

Thanks a lot.

> MiniQJMHACluster should not use static port to avoid binding failure in unit test
> ---------------------------------------------------------------------------------
>                 Key: HDFS-6054
>                 URL: https://issues.apache.org/jira/browse/HDFS-6054
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Brandon Li
>            Assignee: Yongjun Zhang
> One example of the test failures: TestFailureToReadEdits
> {noformat}
> Error Message
> Port in use: localhost:10003
> Stacktrace
> java.net.BindException: Port in use: localhost:10003
> 	at sun.nio.ch.Net.bind(Native Method)
> 	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
> 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
> 	at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
> 	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:845)
> 	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:786)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:132)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:593)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:492)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:650)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:635)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1283)
> 	at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:966)
> 	at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:851)
> 	at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:697)
> 	at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:374)
> 	at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:355)
> 	at org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits.setUpCluster(TestFailureToReadEdits.java:108)
> {noformat}

This message was sent by Atlassian JIRA