hadoop-hdfs-issues mailing list archives

From "Milind Bhandarkar (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2422) The NN should tolerate the same number of low-resource volumes as failed volumes
Date Mon, 10 Oct 2011 23:26:29 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13124566#comment-13124566 ]

Milind Bhandarkar commented on HDFS-2422:
-----------------------------------------

I agree with Eli's "low on space" argument.

A transient loss of connectivity to an NFS mount is currently reported as if the NFS mount
were low on space (in fact, as having 0 space left). This is unfortunate. If there were a way
to distinguish between the two (I cannot think of one, but others may have an answer), it would
be ideal for the namenode to come out of safe mode automatically once the transient error goes away.
                
> The NN should tolerate the same number of low-resource volumes as failed volumes
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-2422
>                 URL: https://issues.apache.org/jira/browse/HDFS-2422
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.24.0
>            Reporter: Jeff Bean
>            Assignee: Aaron T. Myers
>         Attachments: HDFS-2422.patch
>
>
> We encountered a situation where the namenode dropped into safe mode after a temporary outage of an NFS mount.
> At 12:10 the NFS server goes offline
> Oct  8 12:10:05 <namenode> kernel: nfs: server <nfs host> not responding, timed out
> This caused the namenode to conclude resource issues:
> 2011-10-08 12:10:34,848 WARN org.apache.hadoop.hdfs.server.namenode.NameNodeResourceChecker: Space available on volume '<nfs host>' is 0, which is below the configured reserved amount 104857600
> Temporary loss of NFS mount shouldn't cause safemode.
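
The behaviour requested in the issue title can be illustrated with a minimal sketch. This is not the code in HDFS-2422.patch, nor the actual NameNodeResourceChecker; the class name, method names, and the toleratedLowVolumes parameter are hypothetical and exist only for illustration. The reserved amount is the value from the log line above. The idea shown is simply that a volume reporting less than the reserved space counts as low on resources, and that the check passes as long as the number of such volumes stays within a tolerated limit.

```java
import java.util.List;

// Hypothetical sketch only; not taken from HDFS-2422.patch or from
// org.apache.hadoop.hdfs.server.namenode.NameNodeResourceChecker.
final class LowResourceToleranceSketch {

    // Reserved amount in bytes, as printed in the quoted log line (100 MB).
    static final long RESERVED_BYTES = 104857600L;

    // Counts the volumes whose available space is below the reserved amount.
    static int countLowResourceVolumes(List<Long> availableBytesPerVolume,
                                       long reservedBytes) {
        int low = 0;
        for (long available : availableBytesPerVolume) {
            if (available < reservedBytes) {
                low++;
            }
        }
        return low;
    }

    // Returns true while the number of low-resource volumes does not exceed
    // the tolerated number (an assumed parameter, mirroring the idea of a
    // tolerated number of failed volumes).
    static boolean hasEnoughResources(List<Long> availableBytesPerVolume,
                                      long reservedBytes,
                                      int toleratedLowVolumes) {
        return countLowResourceVolumes(availableBytesPerVolume, reservedBytes)
                <= toleratedLowVolumes;
    }
}
```

Under these assumptions, a namenode with one healthy local volume and one NFS volume reporting 0 bytes would pass the check with a tolerance of 1 and fail it with a tolerance of 0. The sketch does not attempt to tell a transient NFS outage apart from a volume that is genuinely full.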

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
