hadoop-hdfs-dev mailing list archives

From "Yonghwan Kim (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-4945) A Distributed and Cooperative NameNode Cluster for a Highly-Available HDFS
Date Sun, 30 Jun 2013 03:38:19 GMT
Yonghwan Kim created HDFS-4945:

             Summary: A Distributed and Cooperative NameNode Cluster for a Highly-Available HDFS
                 Key: HDFS-4945
                 URL: https://issues.apache.org/jira/browse/HDFS-4945
             Project: Hadoop HDFS
          Issue Type: New Feature
          Components: auto-failover
    Affects Versions: HA branch (HDFS-1623)
            Reporter: Yonghwan Kim

Hadoop has recently attracted considerable attention from engineers and researchers as an
effective framework for Big Data.
HDFS (Hadoop Distributed File System) can manage huge amounts of data while guaranteeing high
performance and reliability using only commodity hardware.

However, HDFS requires a single master node, called the NameNode, to manage the entire namespace
(that is, all the i-nodes) of a file system. This creates a SPOF (Single Point Of Failure),
because the file system becomes inaccessible when the NameNode fails. (HDFS-2064)

The single NameNode is also a performance bottleneck, since every access request to the file
system has to contact it. Hadoop 2.0 addresses the SPOF problem by introducing manual failover
between two NameNodes, Active and Standby.
However, the performance bottleneck remains, since in normal operation every access request
still has to contact the Active NameNode. This design may also lose the advantage of using
commodity hardware, since the two NameNodes have to share highly reliable, sophisticated storage.

We propose a new HDFS architecture that addresses all of the problems mentioned above.
The proposed architecture has the following features and advantages.

1. Multiple NameNodes (not restricted to two) can be used to improve availability.
The entire namespace of a file system is partitioned into several fragments, and replicas
of each fragment are distributed among the NameNodes. When each fragment has k replicas,
the file system can tolerate up to floor(k/2 - 1) faulty NameNodes.

2. Multiple NameNodes can be used to improve performance. The performance bottleneck caused
by a single NameNode can be avoided by assigning a different NameNode to each fragment as its
primary (that is, its entry point).

3. The highly reliable storage shared by the NameNodes is eliminated by introducing a
message-based mechanism among the NameNodes. The architecture requires only commodity hardware.
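
To make the three points above more concrete, here is a minimal Python sketch of how
fragments, replicas, and primaries might fit together. It is an illustration only and not
the design attached to this issue: the proposal does not say how the namespace is
partitioned or how replicas are placed, so hashing on the top-level directory, the
round-robin placement, the function names (fragment_of, place_replicas, tolerated_faults),
and clamping the tolerance at zero for small k are all assumptions made here. Only the
formula floor(k/2 - 1) comes from the text above.

```python
import zlib
from typing import Dict, List


def fragment_of(path: str, num_fragments: int) -> int:
    """Map a path to a fragment id.

    Assumption: the namespace is split by hashing the top-level directory,
    so every path under the same top-level directory shares a fragment.
    """
    parts = [p for p in path.split("/") if p]
    top = parts[0] if parts else ""
    return zlib.crc32(top.encode("utf-8")) % num_fragments


def place_replicas(num_fragments: int, namenodes: List[str], k: int) -> Dict[int, List[str]]:
    """Assign k distinct NameNodes to each fragment (assumed round-robin).

    The first NameNode in each list plays the role of the primary
    (the entry point) for that fragment.
    """
    if k > len(namenodes):
        raise ValueError("k cannot exceed the number of NameNodes")
    n = len(namenodes)
    return {f: [namenodes[(f + i) % n] for i in range(k)] for f in range(num_fragments)}


def tolerated_faults(k: int) -> int:
    """Faulty NameNodes tolerated with k replicas: floor(k/2 - 1).

    The clamp at zero for very small k is an assumption of this sketch.
    """
    return max(0, k // 2 - 1)


# Hypothetical cluster: five NameNodes, four fragments, three replicas each.
namenodes = ["nn0", "nn1", "nn2", "nn3", "nn4"]
placement = place_replicas(4, namenodes, 3)
fragment = fragment_of("/user/alice/data.txt", 4)
primary = placement[fragment][0]
print(f"fragment {fragment} is served by primary {primary}")
print(f"tolerated faults with k=4: {tolerated_faults(4)}")
```

With round-robin placement, different fragments get different primaries, so requests for
different parts of the namespace enter the cluster through different NameNodes, which is
the effect described in point 2.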

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
