hadoop-hdfs-issues mailing list archives

From Gergely Novák (JIRA) <j...@apache.org>
Subject [jira] [Commented] (HDFS-10270) TestJMXGet:testNameNode() fails
Date Mon, 11 Apr 2016 10:03:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234823#comment-15234823 ]

Gergely Novák commented on HDFS-10270:

We observed that the failure is caused by this assert:
DFSTestUtil.waitForMetric(jmx, "NumOpenConnections", numDatanodes);
This checks whether the number of open connections equals the number of data nodes. But the
number of open connections does not depend on the number of data nodes at all: it is either 0,
1 (DataNodeProtocol) or 2 (DataNodeProtocol and ClientProtocol). The test passes only in those
rare cases when the ClientProtocol connection has not yet timed out when {{TestNameNode}} runs (this can
only happen if the tests are run separately). If we increase the number of data nodes (to
3 or so), the test will always fail. Conversely, if we increase the client timeout ({{ipc.client.connection.maxidletime}}),
the test will always pass.

Our suggestion is to remove this assert.
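
To illustrate why the assert ends in a TimeoutException, here is a minimal, self-contained sketch of a poll-until-equal wait. It is not the actual DFSTestUtil.waitForMetric implementation: the class MetricWaitSketch, the helpers waitForValue and timesOut, and the 200 ms timeout are hypothetical names and values chosen for the example, and the metric is simulated with a constant instead of being read over JMX.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.IntSupplier;

public class MetricWaitSketch {

    /** Polls the metric until it equals expected; throws once timeoutMs has elapsed. */
    static void waitForValue(IntSupplier metric, int expected, long timeoutMs, long intervalMs)
            throws TimeoutException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            int actual = metric.getAsInt();
            if (actual == expected) {
                return;
            }
            if (System.nanoTime() >= deadline) {
                throw new TimeoutException("expected " + expected + " but metric is " + actual);
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }

    /** Convenience wrapper: returns true if the wait ended in a timeout. */
    static boolean timesOut(IntSupplier metric, int expected, long timeoutMs) {
        try {
            waitForValue(metric, expected, timeoutMs, 10);
            return false;
        } catch (TimeoutException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int numDatanodes = 2; // assumed value, matching the "expect 2" in the issue description
        // Simulated metric: only the DataNodeProtocol connection is still open,
        // because the client connection has already idled out.
        IntSupplier openConnections = () -> 1;
        System.out.println("timed out: " + timesOut(openConnections, numDatanodes, 200));
    }
}
```

With the simulated metric fixed at 1 and an expected value of 2, the wait can never succeed and only ends when the deadline passes, which matches the 60-second wait reported in the issue.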

> TestJMXGet:testNameNode() fails
> -------------------------------
>                 Key: HDFS-10270
>                 URL: https://issues.apache.org/jira/browse/HDFS-10270
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>            Reporter: Andras Bokor
>            Priority: Minor
>         Attachments: TestJMXGet.log, TestJMXGetFails.log
> It fails with java.util.concurrent.TimeoutException. The actual problem is that
> we expect the NumOpenConnections metric to be 2, but it is only 1, so the test waits 60 seconds and then fails.
> Please find the Maven output, including the stack trace, attached ([^TestJMXGetFails.log]).
