hadoop-hdfs-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-4249) Add status NameNode startup to webUI
Date Tue, 26 Feb 2013 19:20:12 GMT

     [ https://issues.apache.org/jira/browse/HDFS-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated HDFS-4249:
--------------------------------

    Attachment: HDFS-4249-5.png
                HDFS-4249-4.png
                HDFS-4249-3.png
                HDFS-4249-2.png
                HDFS-4249-1.png

I expect to start posting patches for this feature on the sub-tasks later this week after
additional testing.  I am attaching several screenshots.

HDFS-4249-1.png: This shows a new section on dfshealth.jsp called Startup Progress.  It displays
overall elapsed time and percent complete.  Below that, the NameNode startup sequence is divided
into phases: loading fsimage, loading edits, saving a checkpoint, and safe mode.  Phases are
sub-divided into steps, which show more granular operations within each phase.  We display
counters, percent complete, and elapsed time per step, which is also aggregated at the phase
level.  Phases in progress display in italics.  Phases not yet started display in gray text.
Note that some information typically displayed on dfshealth.jsp is missing: RPC server address,
cluster ID and block pool ID.  This is because we are starting the HTTP server before initializing
FSNamesystem and the RPC server, so that information isn't available yet.
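
To make the phase/step breakdown above concrete, here is a minimal sketch of how such a model could be represented. The class and field names are hypothetical and are not taken from the patch. The aggregation rule is also an assumption: a phase sums the elapsed time of its steps and averages their percent complete, and a phase with no steps yet counts as 0% complete.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; not the classes from the HDFS-4249 patch.
class StartupProgressSketch {

    /** One granular operation inside a phase, tracked by a counter against a known total. */
    static final class Step {
        final String name;
        final long total;      // expected number of items to process
        long count;            // items processed so far
        long elapsedMillis;    // time spent in this step so far

        Step(String name, long total) {
            this.name = name;
            this.total = total;
        }

        /** Fraction complete in [0.0, 1.0]; a step with no known total reports 0.0. */
        double percentComplete() {
            if (total <= 0) {
                return 0.0;
            }
            return Math.min(1.0, (double) count / total);
        }
    }

    /** A startup phase such as loading fsimage, loading edits, saving a checkpoint, or safe mode. */
    static final class Phase {
        final String name;
        final List<Step> steps = new ArrayList<>();

        Phase(String name) {
            this.name = name;
        }

        /** Assumed aggregation: phase elapsed time is the sum over its steps. */
        long elapsedMillis() {
            long sum = 0;
            for (Step s : steps) {
                sum += s.elapsedMillis;
            }
            return sum;
        }

        /** Assumed aggregation: unweighted average over steps; 0.0 if the phase has not started. */
        double percentComplete() {
            if (steps.isEmpty()) {
                return 0.0;
            }
            double sum = 0.0;
            for (Step s : steps) {
                sum += s.percentComplete();
            }
            return sum / steps.size();
        }
    }

    /** Assumed overall figure: unweighted average over all phases. */
    static double overallPercent(List<Phase> phases) {
        if (phases.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (Phase p : phases) {
            sum += p.percentComplete();
        }
        return sum / phases.size();
    }
}
```

An unweighted average keeps the sketch short. A real implementation might weight steps by their totals so that a large fsimage dominates the figure.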

HDFS-4249-2.png: Here we see that the saving checkpoint phase has begun.  The interesting
thing about this is that the phase is multi-threaded, one thread per dfs.namenode.name.dir,
so we see steps related to 3 different paths simultaneously, with progress tracked separately
for each one.  This can help identify whether NameNode startup is blocked waiting on a particularly
slow disk while saving the checkpoint.
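
The per-directory tracking described above can be sketched as follows. This is an illustration under stated assumptions, not the patch's code: the class name, the saveAll helper, and the simulated "write" loop are all hypothetical. The point is that each directory gets its own thread and its own counter, so a slow directory shows up as a counter that lags behind the others.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; the real checkpoint code writes an fsimage, this only counts.
class CheckpointProgressSketch {

    /**
     * Runs one simulated save task per directory and returns the per-directory counters.
     * dirs must be non-empty, since the pool is sized to one thread per directory.
     */
    static Map<String, AtomicLong> saveAll(List<String> dirs, long itemsToWrite) {
        Map<String, AtomicLong> written = new ConcurrentHashMap<>();
        for (String dir : dirs) {
            written.put(dir, new AtomicLong());
        }

        ExecutorService pool = Executors.newFixedThreadPool(dirs.size());
        for (String dir : dirs) {
            AtomicLong counter = written.get(dir);
            pool.submit(() -> {
                for (long i = 0; i < itemsToWrite; i++) {
                    // Stand-in for writing one item to this directory.
                    // A UI thread could read counter.get() at any time to show progress.
                    counter.incrementAndGet();
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for save tasks");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for save tasks", e);
        }
        return written;
    }
}
```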

HDFS-4249-3.png: Once the NameNode reaches safe mode and the RPC server is available, we see
more of the traditional output of dfshealth.jsp.  At this point, we move the Startup Progress
section to the bottom of the page.  This keeps the focus on Cluster Summary, which is probably
more useful than Startup Progress during normal operation.

HDFS-4249-4.png: This shows the same information exposed as JSON by making an HTTP call to
a new relative URI: /startupProgress.  This supports clients such as Ambari that may want
to display the data in a different UI.
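
A client could poll that URI roughly as in the sketch below, which uses the JDK's built-in HTTP client (Java 11 or later). The class name and the host and port in the usage note are placeholders, and the sketch deliberately returns the response body as a raw string, because the JSON layout is only shown in the screenshot and is not assumed here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client for illustration; only the /startupProgress path comes from this message.
class StartupProgressClient {

    /** Resolves the relative URI /startupProgress against the NameNode's HTTP base address. */
    static URI startupProgressUri(String nameNodeHttpBase) {
        return URI.create(nameNodeHttpBase).resolve("/startupProgress");
    }

    /** Fetches the startup progress document and returns the body unparsed. */
    static String fetch(String nameNodeHttpBase) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(startupProgressUri(nameNodeHttpBase))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // The base address is supplied by the caller, e.g. http://namenode.example.com:50070
        System.out.println(fetch(args[0]));
    }
}
```

For example, passing http://namenode.example.com:50070 (a placeholder address) as the only argument would request http://namenode.example.com:50070/startupProgress.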

HDFS-4249-5.png: This is a jconsole screenshot showing that phase-level progress information
is also available via JMX, in a new MBean named StartupProgress.

                
> Add status NameNode startup to webUI 
> -------------------------------------
>
>                 Key: HDFS-4249
>                 URL: https://issues.apache.org/jira/browse/HDFS-4249
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Suresh Srinivas
>            Assignee: Chris Nauroth
>         Attachments: HDFS-4249.1.pdf, HDFS-4249-1.png, HDFS-4249-2.png, HDFS-4249-3.png,
>                      HDFS-4249-4.png, HDFS-4249-5.png
>
>
> Currently the NameNode web UI server starts only after the fsimage is loaded, edits are
> applied, and the checkpoint is complete. Any status related to NameNode startup is available
> only in the logs. I propose starting the web server before loading the namespace and
> providing NameNode startup information.
> More details in the next comment.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
