hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-2510) Add HA-related metrics
Date Tue, 07 Feb 2012 08:53:02 GMT

     [ https://issues.apache.org/jira/browse/HDFS-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HDFS-2510:
---------------------------------

    Attachment: HDFS-2510.HDFS-1623.patch

Here's a patch which addresses the issue. In addition to the provided test, I also tested
this manually on a cluster by hitting the /jmx URL and observing the values shown there for
the new metrics.
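
For anyone who wants to repeat the manual check, here is a minimal sketch of reading values out of the /jmx output. It assumes the servlet returns JSON with a top-level "beans" array in which each bean carries a "name" field. The bean name and the metric keys below are illustrative assumptions, not taken from the patch, and the host and port are placeholders you would pass in yourself.

```python
import json
import urllib.request

# ASSUMPTION: bean name and metric keys are illustrative placeholders,
# not confirmed by the patch. Substitute whatever /jmx actually shows.
ASSUMED_BEAN = "Hadoop:service=NameNode,name=FSNamesystem"
ASSUMED_KEYS = ("tag.HAState", "PendingDataNodeMessageCount",
                "MillisSinceLastLoadedEdits")

# Hypothetical sample of the assumed /jmx JSON shape, for offline use.
SAMPLE = """
{"beans": [
  {"name": "java.lang:type=Runtime", "Uptime": 1000},
  {"name": "Hadoop:service=NameNode,name=FSNamesystem",
   "tag.HAState": "standby",
   "PendingDataNodeMessageCount": 12,
   "MillisSinceLastLoadedEdits": 4200}
]}
"""


def fetch_jmx(base_url, timeout=10):
    """Fetch and parse <base_url>/jmx, e.g. base_url='http://nn-host:port'."""
    with urllib.request.urlopen(base_url.rstrip("/") + "/jmx",
                                timeout=timeout) as resp:
        return json.load(resp)


def find_bean(jmx_doc, bean_name):
    """Return the first bean whose 'name' matches, or None if absent."""
    for bean in jmx_doc.get("beans", []):
        if bean.get("name") == bean_name:
            return bean
    return None


def ha_metrics(jmx_doc, bean_name=ASSUMED_BEAN, keys=ASSUMED_KEYS):
    """Pick the requested keys out of one bean; missing keys are skipped."""
    bean = find_bean(jmx_doc, bean_name)
    if bean is None:
        return {}
    return {k: bean[k] for k in keys if k in bean}


if __name__ == "__main__":
    # Runs against the embedded sample; swap in fetch_jmx(...) on a cluster.
    for key, value in ha_metrics(json.loads(SAMPLE)).items():
        print(key, "=", value)
```

Running this against both NNs and comparing the results should make it easy to see which one reports itself as active and how far behind the standby is.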

I implemented all the metrics proposed in the issue description, except for the following:

bq. The difference between highest generation stamp seen from the shared edit log and the
highest generation stamp seen from any DN

I couldn't think of any legitimate use for this. It seems to serve only as a proxy for the
size of the pending DN message queues.

bq. It would probably also be useful to have a DN metric which somehow describes which active/standby
NNs it's talking to, e.g. "time since last communicated with standby/active NNs."

Similarly, I couldn't think of anything useful an operator could learn from this. It also doesn't
help that all DN metrics are currently per DN daemon rather than per BP offer service, so
it's not obvious how to get meaningful DN-side metrics for just a single namespace.

I'm certainly open to suggestions for other metrics that people think might be useful.
                
> Add HA-related metrics
> ----------------------
>
>                 Key: HDFS-2510
>                 URL: https://issues.apache.org/jira/browse/HDFS-2510
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node, ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: HDFS-2510.HDFS-1623.patch
>
>
> Off the top of my head, I can think of:
> NN metrics:
> * A binary metric for active or standby
> * The size of the pending DN message queues
> * A timestamp for when the standby NN last read from shared edit log
> * The difference between highest generation stamp seen from the shared edit log and the
>   highest generation stamp seen from any DN
> It would probably also be useful to have a DN metric which somehow describes which active/standby
> NNs it's talking to, e.g. "time since last communicated with standby/active NNs."
> I'm sure there are others as well. Comments strongly encouraged.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
