hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7331) Add Datanode network counts to datanode jmx page
Date Thu, 06 Nov 2014 23:42:36 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201220#comment-14201220 ]

Aaron T. Myers commented on HDFS-7331:

If I understand correctly, the metrics record failures for each DataXceiver endpoint. I think
you are assuming that all accesses come from within the cluster. That assumption no longer
holds when computation and storage are separated.

I have no strong opinion on this. I'm fine with using a HashMap in this jira and bounding its
size down the road if a problem occurs.

Got it. I'd suggest then that we leave support for bounding the size in the patch, and make
it configurable, but default it to unbounded. That way, if it ends up causing a problem
for someone, they can configure a smaller size without having to deploy a new version of
the software.
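
To make the suggestion concrete, here is a rough sketch of what a configurable bound with
an unbounded default might look like. This is not the code in the attached patches. The
class name HostErrorCounts, the maxEntries parameter, and the least-recently-used eviction
policy are all assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch: per-host error counts with an optional size bound.
 * A maxEntries value of 0 or less means "unbounded", the suggested default.
 */
final class HostErrorCounts {
    private final Map<String, Long> counts;

    HostErrorCounts(final int maxEntries) {
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.counts = new LinkedHashMap<String, Long>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                // Evict the least recently used host only when a bound is configured.
                return maxEntries > 0 && size() > maxEntries;
            }
        };
    }

    /** Adds one error for the given host. */
    synchronized void incrementErrors(String host) {
        counts.merge(host, 1L, Long::sum);
    }

    /** Returns the error count for the host, or 0 if it is not tracked. */
    synchronized long getErrors(String host) {
        Long value = counts.get(host);
        return value == null ? 0L : value;
    }

    /** Returns the number of hosts currently tracked. */
    synchronized int trackedHosts() {
        return counts.size();
    }
}
```

With maxEntries left at 0 the map behaves like a plain unbounded map, so existing
deployments would see no change unless someone sets a bound.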

This brings up a related point: I think the current patch groups the counts by remote address,
which I believe includes the remote port. In the case of non-DN clients, this port can be
anything in the ephemeral port range, which isn't very useful. I think it would be better to
group by IP address only, and not include the port.
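
As a rough illustration of grouping by IP only, something along these lines would derive
the key. The class name RemoteKeys and the method ipOnly are hypothetical and do not come
from the patch; the only library calls used are the standard java.net ones.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

/** Hypothetical helper: derives a port-free grouping key from a remote address. */
final class RemoteKeys {
    private RemoteKeys() {
    }

    /** Returns the textual IP of the peer, dropping the (possibly ephemeral) port. */
    static String ipOnly(InetSocketAddress remote) {
        InetAddress addr = remote.getAddress();
        // getAddress() returns null for unresolved addresses, so fall back
        // to the host string in that case.
        return addr != null ? addr.getHostAddress() : remote.getHostString();
    }
}
```

Two connections from the same client on different ephemeral ports would then map to the
same key, so their counts accumulate in one entry.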

Do folks agree with the above?

> Add Datanode network counts to datanode jmx page
> ------------------------------------------------
>                 Key: HDFS-7331
>                 URL: https://issues.apache.org/jira/browse/HDFS-7331
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>            Priority: Minor
>         Attachments: HDFS-7331.001.patch, HDFS-7331.002.patch
> Add per-datanode counts to the datanode jmx page. For example, networkErrors could be
> exposed like this:
> {noformat}
>   }, {
> ...
>     "DatanodeNetworkCounts" : "{\"dn1\":{\"networkErrors\":1}}",
> ...
>     "NamenodeAddresses" : "{\"localhost\":\"BP-1103235125-\"}",
>     "VolumeInfo" : "{\"/tmp/hadoop-cwl/dfs/data/current\":{\"freeSpace\":3092725760,\"usedSpace\":28672,\"reservedSpace\":0}}",
>     "ClusterId" : "CID-4b38f2ae-5e58-4e15-b3cf-3ba3f46e724e"
>   }, {
> {noformat}

