hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-4103) Alert for missing blocks
Date Fri, 20 Feb 2009 01:36:01 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Raghu Angadi updated HADOOP-4103:

    Attachment: HADOOP-4103.patch

This patch adds alerts for missing blocks. A user can monitor the missing-block count in several ways:

   # 'bin/hdfs dfsadmin -report' reports this count.
   # A warning is shown in red on the NameNode front page.
   # A new stat is added (e.g., for Simon).
        ** A stat that reports the size of the corrupt replicas map is also added.
Once the alert is noticed, an admin can run 'dfsadmin -metasave' to find out which specific blocks
are missing. 'metasave' is improved slightly to list replica info for each block in the 'neededReplication'
list, and the line for a missing block contains the word "MISSING".
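
As a rough illustration of the follow-up step, the sketch below scans a saved metasave output file for the lines flagged "MISSING". It relies only on the one property stated above (the line for a missing block contains the word "MISSING"). The function names and the file path argument are hypothetical and are not part of the patch, and the rest of the line format is deliberately not assumed.

```python
import re
from typing import Iterable, List

# Only the presence of the word "MISSING" is relied upon; the remaining
# layout of a metasave line is not assumed here.
MISSING_WORD = re.compile(r"\bMISSING\b")


def find_missing_lines(lines: Iterable[str]) -> List[str]:
    """Return every line that contains the word MISSING, without newlines."""
    return [line.rstrip("\n") for line in lines if MISSING_WORD.search(line)]


def scan_metasave_file(path: str) -> List[str]:
    """Scan a metasave output file (hypothetical path) for missing-block lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_missing_lines(handle)
```

An operator could call scan_metasave_file() on the file written by 'dfsadmin -metasave' and raise an alert whenever the returned list is non-empty.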

This is a very non-intrusive change and should therefore be fairly safe to backport. It adds no
new state or data structures for the NN to track.

> Alert for missing blocks
> ------------------------
>                 Key: HADOOP-4103
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4103
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.17.2
>            Reporter: Christian Kunz
>            Assignee: Raghu Angadi
>         Attachments: HADOOP-4103.patch
> A whole bunch of datanodes became dead because of some network problems resulting in
> heartbeat timeouts although datanodes were fine.
> Many processes started to fail because of the corrupted filesystem.
> In order to catch and diagnose such problems faster the namenode should detect the corruption
> automatically and provide a way to alert operations. At the minimum it should show the fact
> of corruption on the GUI.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
