hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4103) Alert for missing blocks
Date Tue, 25 Nov 2008 21:41:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650746#action_12650746 ]

Raghu Angadi commented on HADOOP-4103:
--------------------------------------

I am thinking of implementing a background fsck on the NameNode. It will share/reuse most of
the code with the current Fsck. The extra features will help an admin quickly check whether
there is something odd (e.g. the ability to list the last 100 or so blocks in an inconsistent state).

Based on this background check, there could be further improvements over time, such as adding
more monitoring alarms and reducing the latency of detection.

This feature will be optional. Scan period could be around a day. 
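
A minimal sketch of the idea above, under stated assumptions: a periodic scan feeds a bounded list of the most recently seen inconsistent blocks, which an admin could then query. None of these names exist in Hadoop. `RecentInconsistentBlocks`, `BackgroundBlockScanner`, and the `check` supplier are hypothetical, and the supplier only stands in for whatever logic would be reused from the current Fsck.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical bounded history of block IDs found in an inconsistent state. */
class RecentInconsistentBlocks {
    private final int capacity;
    private final Deque<Long> blockIds = new ArrayDeque<>();

    RecentInconsistentBlocks(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Records a block ID, evicting the oldest entry once the list is full. */
    synchronized void record(long blockId) {
        if (blockIds.size() == capacity) {
            blockIds.removeFirst();
        }
        blockIds.addLast(blockId);
    }

    /** Returns a copy of the history, oldest first. */
    synchronized List<Long> snapshot() {
        return new ArrayList<>(blockIds);
    }
}

/** Hypothetical periodic scanner; the check itself is supplied by the caller. */
class BackgroundBlockScanner {
    private final Supplier<List<Long>> check; // assumed stand-in for reused fsck logic
    private final RecentInconsistentBlocks recent;

    BackgroundBlockScanner(Supplier<List<Long>> check, RecentInconsistentBlocks recent) {
        this.check = check;
        this.recent = recent;
    }

    /** Runs one scan and records every inconsistent block it reports. */
    void runOnce() {
        for (long blockId : check.get()) {
            recent.record(blockId);
        }
    }

    /** Starts periodic scans; the caller shuts the returned executor down. */
    ScheduledExecutorService start(long periodHours) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleWithFixedDelay(this::runOnce, 0, periodHours, TimeUnit.HOURS);
        return executor;
    }
}
```

Since the feature is meant to be optional, `start` would only be called when it is enabled, e.g. `scanner.start(24)` for a scan period of about a day, with `new RecentInconsistentBlocks(100)` holding the last 100 or so blocks.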



> Alert for missing blocks
> ------------------------
>
>                 Key: HADOOP-4103
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4103
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.17.2
>            Reporter: Christian Kunz
>            Assignee: Raghu Angadi
>
> A whole bunch of datanodes became dead because of some network problems resulting in
> heartbeat timeouts although datanodes were fine.
> Many processes started to fail because of the corrupted filesystem.
> In order to catch and diagnose such problems faster the namenode should detect the corruption
> automatically and provide a way to alert operations. At the minimum it should show the fact
> of corruption on the GUI.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

