hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-5399) Revisit SafeModeException and corresponding retry policies
Date Mon, 21 Oct 2013 21:41:43 GMT
Jing Zhao created HDFS-5399:

             Summary: Revisit SafeModeException and corresponding retry policies
                 Key: HDFS-5399
                 URL: https://issues.apache.org/jira/browse/HDFS-5399
             Project: Hadoop HDFS
          Issue Type: Improvement
    Affects Versions: 3.0.0
            Reporter: Jing Zhao
            Assignee: Jing Zhao

Currently, for NN SafeMode, we have the following retry policies:

# In a non-HA setup, the client retries if the NN is in SafeMode. Specifically, the
client-side RPC adopts the MultipleLinearRandomRetry policy if a SafeModeException is wrapped
in a RemoteException.
# In an HA setup, the client retries if the NN is Active and in SafeMode. Specifically, the
SafeModeException is wrapped in a RetriableException on the server side. The client-side RPC
uses the FailoverOnNetworkExceptionRetry policy, which recognizes RetriableException (see HDFS-5291).
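
To make the two code paths above concrete, here is a minimal sketch of the client-side retry decision as described. It does not use the real Hadoop classes: `CurrentRetrySketch`, its nested `RemoteException` stand-in, and `shouldRetry` are hypothetical names for illustration, and failover, network errors, and retry timing are left out entirely.

```java
import java.io.IOException;

// Simplified model of the current behaviour described above.
// All types here are hypothetical stand-ins, not the real Hadoop classes.
final class CurrentRetrySketch {
    // Class names the server-side exception is assumed to be reported under.
    static final String SAFE_MODE =
        "org.apache.hadoop.hdfs.server.namenode.SafeModeException";
    static final String RETRIABLE =
        "org.apache.hadoop.ipc.RetriableException";

    /** Stand-in for RemoteException: carries only the server-side class name. */
    static class RemoteException extends IOException {
        private final String className;

        RemoteException(String className, String message) {
            super(message);
            this.className = className;
        }

        String getClassName() {
            return className;
        }
    }

    /**
     * Non-HA: retry when the remote exception is a SafeModeException.
     * HA: retry only when the server wrapped it as a RetriableException.
     */
    static boolean shouldRetry(boolean haEnabled, IOException e) {
        if (!(e instanceof RemoteException)) {
            return false; // other failures are out of scope for this sketch
        }
        String name = ((RemoteException) e).getClassName();
        return haEnabled ? RETRIABLE.equals(name) : SAFE_MODE.equals(name);
    }

    public static void main(String[] args) {
        IOException safeMode = new RemoteException(SAFE_MODE, "NN is in safe mode");
        IOException retriable = new RemoteException(RETRIABLE, "NN is in safe mode");
        System.out.println("non-HA, SafeModeException: " + shouldRetry(false, safeMode));
        System.out.println("HA, SafeModeException: " + shouldRetry(true, safeMode));
        System.out.println("HA, RetriableException: " + shouldRetry(true, retriable));
    }
}
```

The sketch shows the asymmetry this issue is about: the same server condition is recognized through two different exception types depending on the deployment mode.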

There are two issues in the current implementation:
# The NN SafeMode can be a "Manual" SafeMode (i.e., started by an administrator through the CLI),
and clients may not want to retry on this type of SafeMode.
# We should have a single generic strategy for mapping SafeMode to a retry policy in both
HA and non-HA setups. One straightforward solution is to always wrap the SafeModeException
in a RetriableException to indicate that clients should retry.
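
One way the two points above could fit together is sketched below. This is only an illustration under stated assumptions, not the eventual patch: the `manual` flag on the exception, `toClientException`, and `shouldRetry` are hypothetical, and the exception classes are local stand-ins rather than the real Hadoop types. Skipping the wrap for manual SafeMode is also an assumption that combines both points; the issue text itself only proposes always wrapping.

```java
import java.io.IOException;

// Sketch of a single SafeMode-to-retry mapping shared by HA and non-HA setups.
// All types and methods are hypothetical stand-ins for illustration.
final class UnifiedSafeModeSketch {
    /** Stand-in SafeModeException with an assumed "manual" flag. */
    static class SafeModeException extends IOException {
        final boolean manual;

        SafeModeException(String message, boolean manual) {
            super(message);
            this.manual = manual;
        }
    }

    /** Stand-in RetriableException: marks the cause as worth retrying. */
    static class RetriableException extends IOException {
        RetriableException(IOException cause) {
            super(cause);
        }
    }

    /**
     * Server side: wrap the SafeModeException so the client retries, except
     * when an administrator entered SafeMode manually (assumed behaviour).
     */
    static IOException toClientException(SafeModeException e) {
        return e.manual ? e : new RetriableException(e);
    }

    /** Client side: one rule, independent of whether HA is enabled. */
    static boolean shouldRetry(IOException e) {
        return e instanceof RetriableException;
    }

    public static void main(String[] args) {
        IOException startup =
            toClientException(new SafeModeException("startup safe mode", false));
        IOException manual =
            toClientException(new SafeModeException("manual safe mode", true));
        System.out.println("startup SafeMode retried: " + shouldRetry(startup));
        System.out.println("manual SafeMode retried: " + shouldRetry(manual));
    }
}
```

With this shape the client no longer needs to know about SafeModeException at all; the server decides whether a given SafeMode is worth waiting out.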

This message was sent by Atlassian JIRA
