hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2422) The NN should tolerate the same number of low-resource volumes as failed volumes
Date Tue, 11 Oct 2011 06:44:30 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13124752#comment-13124752 ]

Konstantin Shvachko commented on HDFS-2422:
-------------------------------------------

M.C., as far as I know this is exactly the case: the NFS drive was soft mounted. So the
solution is either to hard mount the drive or to set a large enough timeout for the soft mount.
The patch, though, fixes another bug, which puts the NameNode into safe mode if a single drive
runs low on disk space even though other drives are still available for journaling
and saving images.
That bug was introduced by HDFS-1594, so I'd recommend the patch for inclusion in 0.23.
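
To make the difference between the two behaviors concrete, here is a minimal sketch. It is not the actual NameNodeResourceChecker code and not the attached patch; the class name, the method names, and the minUsable parameter are all hypothetical. The only value taken from this issue is the reserved amount 104857600 from the quoted log line.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not NameNodeResourceChecker and not the HDFS-2422 patch.
class LowResourcePolicySketch {
    // Reserved amount in bytes, as it appears in the log line quoted in this issue.
    static final long RESERVED_BYTES = 104857600L;

    // Strict policy (the behavior described as a bug above):
    // a single volume below the reserved amount counts as a resource problem.
    static boolean allVolumesHaveSpace(List<Long> availableBytes) {
        for (long avail : availableBytes) {
            if (avail < RESERVED_BYTES) {
                return false;
            }
        }
        return true;
    }

    // Tolerant policy: resources are considered available as long as
    // at least minUsable volumes are still at or above the reserved amount.
    static boolean enoughVolumesHaveSpace(List<Long> availableBytes, int minUsable) {
        int usable = 0;
        for (long avail : availableBytes) {
            if (avail >= RESERVED_BYTES) {
                usable++;
            }
        }
        return usable >= minUsable;
    }

    public static void main(String[] args) {
        // One volume reports 0 bytes (e.g. an unreachable NFS mount), one local volume is healthy.
        List<Long> volumes = Arrays.asList(0L, 500000000L);
        System.out.println(allVolumesHaveSpace(volumes));        // false: strict check would trip
        System.out.println(enoughVolumesHaveSpace(volumes, 1));  // true: one usable volume remains
    }
}
```

With the strict check, the volume reporting 0 bytes is enough to signal a resource problem; with the tolerant check, the remaining healthy volume keeps the result true.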
                
> The NN should tolerate the same number of low-resource volumes as failed volumes
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-2422
>                 URL: https://issues.apache.org/jira/browse/HDFS-2422
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.24.0
>            Reporter: Jeff Bean
>            Assignee: Aaron T. Myers
>             Fix For: 0.24.0
>
>         Attachments: HDFS-2422.patch
>
>
> We encountered a situation where the namenode dropped into safe mode after a temporary outage of an NFS mount.
> At 12:10 the NFS server goes offline
> Oct  8 12:10:05 <namenode> kernel: nfs: server <nfs host> not responding, timed out
> This caused the namenode to conclude resource issues:
> 2011-10-08 12:10:34,848 WARN org.apache.hadoop.hdfs.server.namenode.NameNodeResourceChecker: Space available on volume '<nfs host>' is 0, which is below the configured reserved amount 104857600
> Temporary loss of NFS mount shouldn't cause safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
