hadoop-hdfs-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4859) Add timeout in FileJournalManager
Date Tue, 28 May 2013 18:32:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668534#comment-13668534 ]

Kihwal Lee commented on HDFS-4859:
----------------------------------

We will certainly use one of the HA-enabled journal managers in the future, but many users
I've talked to want an NFS-based setup as a first step. Even if QJM is used for the shared edits
directory, local or NFS storage may still be used to keep an extra copy of the edits (as a
non-required resource). In this case, the lack of a timeout in FJM can affect HA with manual
failover. Can the health checks used with ZKFC detect an I/O hang?
                
> Add timeout in FileJournalManager
> ---------------------------------
>
>                 Key: HDFS-4859
>                 URL: https://issues.apache.org/jira/browse/HDFS-4859
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, namenode
>    Affects Versions: 2.0.4-alpha
>            Reporter: Kihwal Lee
>
> Due to the absence of an explicit timeout in FileJournalManager, error conditions that incur
> a long delay (usually until the driver times out) can make the namenode unresponsive for a long
> time. This directly affects the NN's failure detection latency, which is critical in HA.
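
For illustration only, here is a minimal Java sketch of the general idea of bounding a
blocking file operation with a timeout. It is not the FileJournalManager code and not the
fix proposed for this issue. The class TimedIo, the method callWithTimeout, and the
timeoutMs parameter are hypothetical names made up for this example. It uses only the
standard java.util.concurrent API.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for this example; not part of Hadoop. */
final class TimedIo {
    private TimedIo() {}

    /**
     * Runs a blocking operation on a daemon worker thread and stops waiting
     * for it after timeoutMs milliseconds, reporting the timeout as an IOException.
     */
    static <T> T callWithTimeout(Callable<T> op, long timeoutMs) throws IOException {
        // Daemon thread: a worker that stays stuck does not keep the JVM alive.
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "timed-io");
            t.setDaemon(true);
            return t;
        });
        Future<T> result = worker.submit(op);
        try {
            return result.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Best effort only: a thread blocked in the kernel may ignore the interrupt.
            result.cancel(true);
            throw new IOException("I/O did not complete within " + timeoutMs + " ms", e);
        } catch (ExecutionException e) {
            throw new IOException("I/O failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while waiting for I/O", e);
        } finally {
            worker.shutdownNow();
        }
    }
}
```

A caller could wrap a sync or flush in callWithTimeout and treat the resulting IOException
like any other journal failure. Note the limitation: the caller is unblocked, but a worker
thread hung on an unresponsive mount may remain stuck until the underlying driver gives up.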

