hadoop-hdfs-issues mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1432) HDFS across data centers: HighTide
Date Thu, 30 Sep 2010 12:52:37 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916458#action_12916458 ]
dhruba borthakur commented on HDFS-1432:

The goal of the HighTideNode is to keep only one physical replica per data center. This is
mostly for older files that change very infrequently. The HighTide server watches over the
two HDFS namespaces from two different NameNodes in two different data centers. These two
equivalent namespaces will be populated by means that are external to HighTide. The HighTide
server verifies (via checksums of the crc files) that two directories in the two HDFS clusters
contain identical data and, if so, reduces the replication factor to 2 on both. (One or both
clusters could be using HDFS-RAID as well.) The HighTideNode monitors for missing replicas on
both NameNodes, and if it finds any, it fixes them by copying data from the other NameNode in
the remote data center.
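The directory-equivalence check above can be sketched roughly as follows. This is a minimal illustration, assuming the per-file checksums have already been collected into path-to-checksum maps; `DirChecksumDiff` and `mismatches` are hypothetical names, not part of HDFS or HighTide:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hedged sketch: checksums are modeled as precomputed path->checksum maps,
// standing in for the per-file crc data the HighTide server would read.
public class DirChecksumDiff {

    // Returns the paths whose checksum differs between the two namespaces,
    // or which are present in only one of them. An empty result means the
    // two directories hold identical data, so replication can be reduced.
    public static Set<String> mismatches(Map<String, String> dirA,
                                         Map<String, String> dirB) {
        Set<String> bad = new TreeSet<>();
        Set<String> allPaths = new HashSet<>(dirA.keySet());
        allPaths.addAll(dirB.keySet());
        for (String path : allPaths) {
            if (!Objects.equals(dirA.get(path), dirB.get(path))) {
                bad.add(path);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        Map<String, String> hdfs1 = Map.of("/warehouse/f1", "crc-aa");
        Map<String, String> hdfs2 = Map.of("/warehouse/f1", "crc-aa");
        System.out.println("mismatches: " + mismatches(hdfs1, hdfs2));
    }
}
```

Only when the mismatch set is empty would the HighTide server lower the replication factor on both clusters.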

In short, replication within an HDFS cluster will occur via the NameNode as usual. Each
NameNode will maintain fewer than 3 copies of the data. Replication across HDFS clusters
will be coordinated by the HighTideNode, which invokes the -list-corruptFiles RPC on each
NameNode periodically (every minute) to detect missing replicas.
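One round of that polling loop might look like the sketch below. The NameNode RPC is modeled as a plain supplier of corrupt-file paths; `HighTidePoller` and its shape are assumptions for illustration, not the real -list-corruptFiles client API:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Hedged sketch of the HighTideNode polling round. Each supplier stands in
// for the -list-corruptFiles RPC against one NameNode.
public class HighTidePoller {
    private final Supplier<Set<String>> corruptOnHdfs1;
    private final Supplier<Set<String>> corruptOnHdfs2;

    public HighTidePoller(Supplier<Set<String>> corruptOnHdfs1,
                          Supplier<Set<String>> corruptOnHdfs2) {
        this.corruptOnHdfs1 = corruptOnHdfs1;
        this.corruptOnHdfs2 = corruptOnHdfs2;
    }

    // One polling round (the comment suggests running this every minute).
    // A file missing replicas on one cluster but intact on the other is
    // fixable by a cross-data-center copy; files corrupt on both clusters
    // cannot be repaired this way and are left out of the plan.
    public Map<String, String> poll() {
        Set<String> bad1 = corruptOnHdfs1.get();
        Set<String> bad2 = corruptOnHdfs2.get();
        Map<String, String> plan = new LinkedHashMap<>();
        for (String f : bad1) {
            if (!bad2.contains(f)) plan.put(f, "copy from HDFS2 to HDFS1");
        }
        for (String f : bad2) {
            if (!bad1.contains(f)) plan.put(f, "copy from HDFS1 to HDFS2");
        }
        return plan;
    }

    public static void main(String[] args) {
        HighTidePoller p = new HighTidePoller(
                () -> Set.of("/warehouse/f1"),
                () -> Set.of());
        System.out.println(p.poll());
    }
}
```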

*DataNodeGateway*: I envision a single HighTideNode coordinating replication between multiple
HDFS clusters. An alternative approach would be a gateway design: a specialized DataNode that
exports the DataNode protocol and appears to an HDFS cluster as one very large DataNode, but
instead of storing blocks on local disks, the GatewayDataNode would store data in a remote
HDFS cluster. This is similar to existing NFS gateways, e.g. NFS-CIFS interaction. The downside
is that this design is more complex and intrusive to HDFS, rather than being a layer on top
of it.

*Mean-Time-To-Recover (MTR)*: Will this approach of having remote replicas increase the probability
of data loss? My claim is that we should try to keep the MTR practically the same as it is
today. If all the replicas of a block on HDFS1 go missing, the HighTideNode will first
increase the replication factor of the equivalent file in HDFS2. This ensures that we get
back to 3 overall copies as soon as possible, keeping the MTR the same as it is now. The
HighTideNode will then copy this block from HDFS2 to HDFS1, and wait for HDFS1 to reach a
replica count of 2 before decreasing the replica count on HDFS2 from 3 back to 2.
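The recovery ordering described above can be sketched as a sequence of steps. `Cluster`, its fields, and `recover` are hypothetical stand-ins for per-cluster replica state, not HDFS APIs; the point is only the ordering: restore 3 overall copies first, repair the damaged cluster, and only then shrink back:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the MTR recovery sequence from the comment.
public class HighTideRecovery {
    static class Cluster {
        final String name;
        int replication;   // target replication factor set on the file
        int liveReplicas;  // replicas currently present
        Cluster(String name, int replication, int liveReplicas) {
            this.name = name;
            this.replication = replication;
            this.liveReplicas = liveReplicas;
        }
    }

    // (1) Raise the healthy cluster to 3 so we are back to 3 overall copies
    // quickly; (2) copy the block into the damaged cluster and let its own
    // NameNode re-replicate it to 2; (3) only then lower the healthy
    // cluster back to 2.
    static List<String> recover(Cluster damaged, Cluster healthy) {
        List<String> steps = new ArrayList<>();

        healthy.replication = 3;
        healthy.liveReplicas = 3;
        steps.add("raise " + healthy.name + " replication to 3");

        damaged.liveReplicas = 1;  // cross-data-center copy restores one replica
        steps.add("copy block from " + healthy.name + " to " + damaged.name);

        damaged.liveReplicas = damaged.replication;  // local NameNode re-replicates
        steps.add("wait for " + damaged.name + " to reach replica count 2");

        healthy.replication = 2;
        healthy.liveReplicas = 2;
        steps.add("lower " + healthy.name + " replication back to 2");

        return steps;
    }

    public static void main(String[] args) {
        Cluster hdfs1 = new Cluster("HDFS1", 2, 0);
        Cluster hdfs2 = new Cluster("HDFS2", 2, 2);
        for (String step : recover(hdfs1, hdfs2)) System.out.println(step);
    }
}
```

Note that at no point during recovery does the overall copy count drop below what exists today, which is the basis of the claim that the MTR stays practically unchanged.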

> HDFS across data centers: HighTide
> ----------------------------------
>                 Key: HDFS-1432
>                 URL: https://issues.apache.org/jira/browse/HDFS-1432
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
> There are many instances when the same piece of data resides on multiple HDFS clusters
> in different data centers, the primary reason being that the physical limitations of one
> data center are insufficient to host the entire data set. In that case, the administrator(s)
> typically partition the data into two (or more) HDFS clusters in two different data centers
> and then duplicate some subset of that data into both HDFS clusters.
> In such a situation, there will be six physical copies of the duplicated data: three
> copies in one data center and another three copies in the other. It would be nice if we
> could keep fewer than 3 replicas in each of the data centers and have the ability to fix
> a replica in the local data center by copying data from the remote copy in the remote data
> center.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
