Date: Tue, 18 Feb 2014 21:11:36 +0000 (UTC)
From: "Eric Sirianni (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-5318) Support read-only and read-write paths to shared replicas

     [ https://issues.apache.org/jira/browse/HDFS-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Sirianni updated HDFS-5318:
--------------------------------
    Attachment: HDFS-5318-trunkb.patch

Updated patch based on Arpit's feedback.

> Support read-only and read-write paths to shared replicas
> ---------------------------------------------------------
>
>                 Key: HDFS-5318
>                 URL: https://issues.apache.org/jira/browse/HDFS-5318
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.3.0
>            Reporter: Eric Sirianni
>         Attachments: HDFS-5318-trunk.patch, HDFS-5318-trunkb.patch, HDFS-5318.patch, HDFS-5318a-branch-2.patch, HDFS-5318b-branch-2.patch, HDFS-5318c-branch-2.patch, hdfs-5318.pdf
>
>
> There are several use cases for using shared storage for datanode block storage in an HDFS environment (storing cold blocks on a NAS device, Amazon S3, etc.).
> With shared storage, there is a distinction between:
> # a distinct physical copy of a block
> # an access path to that block via a datanode.
> A single 'replication count' metric cannot accurately capture both aspects. However, for most of the current uses of 'replication count' in the Namenode, the "number of physical copies" aspect seems to be the appropriate semantic.
> I propose altering the replication counting algorithm in the Namenode to accurately infer distinct physical copies in a shared storage environment. With HDFS-5115, a {{StorageID}} is a UUID. I propose attaching some minor additional semantics to the {{StorageID}}, namely that multiple datanodes attaching to the same physical shared storage pool should report the same {{StorageID}} for that pool. A minor modification would be required in the DataNode to make the generation of {{StorageID}}s pluggable behind the {{FsDatasetSpi}} interface.
> With those semantics in place, the number of physical copies of a block in a shared storage environment can be calculated as the number of _distinct_ {{StorageID}}s associated with that block.
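
For illustration only, the counting rule described above (physical copies = number of distinct {{StorageID}}s reported for a block) might be sketched as follows. This is not taken from the attached patches: the class name, the Location type, and the method name are all hypothetical, and the NameNode's real block map is not modeled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; not the HDFS-5318 patch and not the NameNode's real data model. */
final class PhysicalReplicaCounter {

    /** One (DataNode ID, Storage ID) pair reported for a block. Illustrative type only. */
    static final class Location {
        final String datanodeId;
        final String storageId;

        Location(String datanodeId, String storageId) {
            this.datanodeId = datanodeId;
            this.storageId = storageId;
        }
    }

    /** Physical copies = number of distinct storage IDs among the block's locations. */
    static int countPhysicalReplicas(List<Location> locations) {
        Set<String> distinctStorageIds = new HashSet<>();
        for (Location location : locations) {
            distinctStorageIds.add(location.storageId);
        }
        return distinctStorageIds.size();
    }

    public static void main(String[] args) {
        // Two datanodes mounting the same shared pool report the same storage ID.
        List<Location> sharedPool = Arrays.asList(
            new Location("DN_A", "S_SHARED"),
            new Location("DN_B", "S_SHARED"));
        System.out.println(countPhysicalReplicas(sharedPool)); // prints 1
    }
}
```

The datanode ID plays no part in the count here; it only identifies the access path, which is the distinction the description draws.
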
> Consider the following combinations for two {{(DataNode ID, Storage ID)}} pairs {{(DN_A, S_A) (DN_B, S_B)}} for a given block B:
> * {{DN_A != DN_B && S_A != S_B}} - *different* access paths to *different* physical replicas (i.e. the traditional HDFS case with local disks)
> ** → Block B has {{ReplicationCount == 2}}
> * {{DN_A != DN_B && S_A == S_B}} - *different* access paths to the *same* physical replica (e.g. HDFS datanodes mounting the same NAS share)
> ** → Block B has {{ReplicationCount == 1}}
> For example, if block B has the following location tuples:
> * {{DN_1, STORAGE_A}}
> * {{DN_2, STORAGE_A}}
> * {{DN_3, STORAGE_B}}
> * {{DN_4, STORAGE_B}},
> the effect of this proposed change would be to calculate the replication count in the namenode as *2* instead of *4*.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
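
As a rough worked example of the four location tuples for block B, the sketch below groups datanodes by storage ID, so that both aspects are visible at once: the number of access paths and the number of physical replicas. The class and method names are hypothetical, and the string pairs stand in for whatever structures the NameNode actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the block B example; not code from the HDFS-5318 patches. */
final class AccessPathGrouping {

    /**
     * Groups datanode IDs by the storage ID they report.
     * Each input row is {datanodeId, storageId}. Each map key is one physical
     * replica; each value lists the access paths (datanodes) to that replica.
     */
    static Map<String, List<String>> accessPathsByStorage(String[][] locations) {
        Map<String, List<String>> byStorage = new TreeMap<>();
        for (String[] location : locations) {
            String datanodeId = location[0];
            String storageId = location[1];
            byStorage.computeIfAbsent(storageId, k -> new ArrayList<>()).add(datanodeId);
        }
        return byStorage;
    }

    public static void main(String[] args) {
        String[][] blockB = {
            {"DN_1", "STORAGE_A"},
            {"DN_2", "STORAGE_A"},
            {"DN_3", "STORAGE_B"},
            {"DN_4", "STORAGE_B"},
        };
        Map<String, List<String>> byStorage = accessPathsByStorage(blockB);
        System.out.println("access paths: " + blockB.length);          // 4
        System.out.println("physical replicas: " + byStorage.size());  // 2
    }
}
```

Counting rows gives the per-datanode figure of 4, while counting map keys gives the proposed figure of 2.
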