hadoop-hdfs-issues mailing list archives

From "Arun Suresh (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-7858) Improve HA Namenode Failover detection on the client using Zookeeper
Date Fri, 27 Feb 2015 21:05:05 GMT
Arun Suresh created HDFS-7858:

             Summary: Improve HA Namenode Failover detection on the client using Zookeeper
                 Key: HDFS-7858
                 URL: https://issues.apache.org/jira/browse/HDFS-7858
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Arun Suresh
            Assignee: Arun Suresh

In an HA deployment, clients are configured with the hostnames of both the Active and Standby
Namenodes. A client first tries one of the NNs (chosen non-deterministically), and if it is a standby
NN, that NN tells the client to retry the request on the other Namenode.

If the client happens to talk to the Standby first, and the standby is in a GC pause
or otherwise busy, the client may wait a long time for a response before it can try the other NN.
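
To make the cost concrete, here is a minimal sketch of the retry behaviour described above. It is a toy model, not Hadoop code: the function name, the `stalled` set, and the 20-second timeout are all assumptions chosen for illustration.

```python
def wasted_seconds(order, active, stalled=frozenset(), timeout_s=20.0):
    """Seconds a client loses before its request reaches the active NN.

    order     -- Namenodes in the order the client happens to try them
    active    -- the Namenode that is currently active
    stalled   -- standbys that are in GC / too busy to answer (assumption)
    timeout_s -- hypothetical client-side timeout, not a Hadoop default
    """
    wasted = 0.0
    for nn in order:
        if nn == active:
            return wasted
        if nn in stalled:
            # No reply arrives, so the client waits out the full timeout.
            wasted += timeout_s
        # Otherwise a healthy standby answers at once with "retry on the other NN".
    raise RuntimeError("no active Namenode in the configured list")
```

In this model a healthy standby costs nothing extra, while a stalled standby tried first costs the whole timeout, which is the case this issue wants to avoid.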

Proposed approach:
1) Since Zookeeper is already used by the failover controller, clients could ask ZK
which Namenode is active before contacting it.
2) Long-lived DFSClients would set a ZK watch that fires when a failover occurs,
so they do not have to query ZK every time to find the active NN.
3) Clients can also cache the last active NN in the user's home directory (~/.lastNN) so that
short-lived clients can try that Namenode first before querying ZK.
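
The three points above could fit together roughly as follows. This is only a sketch under stated assumptions: `ActiveNNResolver`, `query_zk`, and `on_failover` are hypothetical names, the ZK lookup is passed in as a plain callable rather than a real Zookeeper client, and the cache file format (one hostname per file) is invented for the example.

```python
from typing import Callable, Optional


class ActiveNNResolver:
    """Hypothetical helper that picks which Namenode a client tries first."""

    def __init__(self, query_zk: Callable[[], str], cache_path: str):
        self._query_zk = query_zk      # stand-in for "ask ZK who is active"
        self._cache_path = cache_path  # e.g. the expanded ~/.lastNN path
        self._active: Optional[str] = None

    def _read_cache(self) -> Optional[str]:
        try:
            with open(self._cache_path) as f:
                return f.read().strip() or None
        except OSError:
            return None  # missing or unreadable cache is not an error

    def _write_cache(self, nn: str) -> None:
        try:
            with open(self._cache_path, "w") as f:
                f.write(nn + "\n")
        except OSError:
            pass  # the cache is best effort

    def candidate(self) -> str:
        """Namenode to try first: in-memory value, then cache file, then ZK."""
        if self._active is None:
            self._active = self._read_cache()
        if self._active is None:
            self.refresh()
        return self._active

    def refresh(self) -> str:
        """Query ZK; call this when the cached candidate turned out to be stale."""
        self._active = self._query_zk()
        self._write_cache(self._active)
        return self._active

    def on_failover(self, new_active: str) -> None:
        """Callback a long-lived client would invoke from its ZK watch."""
        self._active = new_active
        self._write_cache(new_active)
```

A short-lived client would call `candidate()` once and fall back to `refresh()` only if the cached Namenode turns out to be a standby. A long-lived client would additionally register `on_failover` with whatever watch mechanism its ZK client provides.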

This message was sent by Atlassian JIRA
