Date: Sun, 3 Sep 2017 21:21:00 +0000 (UTC)
From: "Chris Douglas (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-12077) Provide a multi-URI replication Inode for ViewFs

    [ https://issues.apache.org/jira/browse/HADOOP-12077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151974#comment-16151974 ]

Chris Douglas commented on HADOOP-12077:
----------------------------------------

I went through the test failures; they appear unrelated. The javac warning is [blocked|https://issues.apache.org/jira/browse/HADOOP-12077?focusedCommentId=16113734&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16113734] on related JIRAs. I can fix the checkstyle warnings before commit. If there are no objections, I'll commit this in a day or two.

> Provide a multi-URI replication Inode for ViewFs
> ------------------------------------------------
>
>                 Key: HADOOP-12077
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12077
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>         Attachments: HADOOP-12077.001.patch, HADOOP-12077.002.patch, HADOOP-12077.003.patch, HADOOP-12077.004.patch, HADOOP-12077.005.patch, HADOOP-12077.006.patch, HADOOP-12077.007.patch, HADOOP-12077.008.patch, HADOOP-12077.009.patch, HADOOP-12077.010.patch
>
>
> This JIRA is to provide simple "replication" capabilities for applications that maintain logically equivalent paths in multiple locations for caching or failover (e.g., S3 and HDFS). We noticed a simple, common HDFS usage pattern in our applications. They host their data on some logical cluster C. There are corresponding HDFS clusters in multiple datacenters. When the application runs in DC1, it prefers to read from C in DC1, and it prefers to fail over to C in DC2 if the application is migrated to DC2 or when C in DC1 is unavailable. New application data versions are created periodically and relatively infrequently.
> In order to address many common scenarios in a general fashion, and to avoid unnecessary code duplication, we implement this functionality in ViewFs (our default FileSystem spanning all clusters in all datacenters) in a project code-named Nfly (N as in N datacenters). Currently each ViewFs Inode points to a single URI via ChRootedFileSystem. Consequently, we introduce a new type of link that points to a list of URIs, each of which is wrapped in a ChRootedFileSystem. A typical usage: /nfly/C/user->/DC1/C/user,/DC2/C/user,... This collection of ChRootedFileSystem instances is fronted by the Nfly filesystem object that is actually used for the mount point/Inode. The Nfly filesystem backs a single logical path /nfly/C/user//path by multiple physical paths.
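As an illustration of the link described above, the following is a minimal, self-contained sketch of how one logical path could fan out to one physical path per target. It does not use the Hadoop API: the class `NflyLinkSketch`, the method `resolve`, and the example file name `data/part-0` are hypothetical names chosen for this sketch, and plain string paths stand in for the ChRootedFileSystem instances.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NflyLinkSketch {

    /**
     * Hypothetical helper (not part of Hadoop): maps a logical path under a
     * mount point to one physical path per link target.
     */
    static List<String> resolve(String mountPoint, List<String> targets, String logicalPath) {
        boolean underMount = logicalPath.equals(mountPoint)
                || logicalPath.startsWith(mountPoint + "/");
        if (!underMount) {
            throw new IllegalArgumentException(logicalPath + " is not under " + mountPoint);
        }
        // The part of the path below the mount point is the same for every target.
        String remainder = logicalPath.substring(mountPoint.length());
        List<String> physicalPaths = new ArrayList<>();
        for (String target : targets) {
            physicalPaths.add(target + remainder);
        }
        return physicalPaths;
    }

    public static void main(String[] args) {
        // Mirrors the typical usage /nfly/C/user->/DC1/C/user,/DC2/C/user
        List<String> targets = Arrays.asList("/DC1/C/user", "/DC2/C/user");
        System.out.println(resolve("/nfly/C/user", targets, "/nfly/C/user/data/part-0"));
        // prints [/DC1/C/user/data/part-0, /DC2/C/user/data/part-0]
    }
}
```

In the patch itself the targets are filesystem URIs rather than bare paths, so this sketch only shows the path arithmetic, not how the mount table is configured.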
> The Nfly filesystem supports setting minReplication. As long as the number of URIs on which an update has succeeded is greater than or equal to minReplication, exceptions are only logged but not thrown. Each update operation is currently executed serially (client-bandwidth driven parallelism will be added later).
> A file create/write:
> # Creates a temporary invisible _nfly_tmp_file in each intended chrooted filesystem.
> # Returns an FSDataOutputStream that wraps the output streams returned by step 1.
> # All writes are forwarded to each output stream.
> # On close of the stream created in step 2, all n streams are closed, and the files are renamed from _nfly_tmp_file to file. All files receive the same mtime, corresponding to the client system time at the beginning of this step.
> # If at least minReplication destinations have gone through steps 1-4 without failures, the transaction is considered logically committed; otherwise a best-effort cleanup of the temporary files is attempted.
> As for reads, we support a notion of locality similar to HDFS /DC/rack/node. We sort Inode URIs using NetworkTopology by their authorities. These are typically host names in simple HDFS URIs. If the authority is missing, as is the case with the local file:///, the local host name from InetAddress.getLocalHost() is assumed. This makes sure that the local file system is always the closest one to the reader in this approach. For our Hadoop 2 hdfs URIs, which are based on nameservice ids instead of hostnames, it is very easy to adjust the topology script since our nameservice ids already contain the datacenter. As for rack and node, we can simply output any string such as /DC/rack-nsid/node-nsid, since we only care about datacenter locality for such filesystem clients.
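The create/write steps and the minReplication rule described above can be sketched as follows. This is a simplified model under stated assumptions, not the code in the patch: `Destination`, `InMemoryDestination`, and `replicate` are hypothetical names, the whole payload is written in one call instead of through a wrapping stream, and cleanup only removes temporary files because the description does not say what happens to files that were already renamed.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class NflyWriteSketch {

    /** Hypothetical stand-in for one chrooted destination filesystem. */
    interface Destination {
        void writeTmp(String file, byte[] data) throws IOException; // steps 1-3
        void commit(String file, long mtime) throws IOException;    // step 4: rename, set mtime
        void deleteTmp(String file);                                // best-effort cleanup
    }

    /** In-memory destination used only to exercise the sketch. */
    static final class InMemoryDestination implements Destination {
        final boolean failing;
        final Set<String> tmp = new HashSet<>();
        final Map<String, Long> committed = new HashMap<>();

        InMemoryDestination(boolean failing) { this.failing = failing; }

        public void writeTmp(String file, byte[] data) throws IOException {
            if (failing) throw new IOException("destination unavailable");
            tmp.add(file);
        }
        public void commit(String file, long mtime) {
            tmp.remove(file);
            committed.put(file, mtime);
        }
        public void deleteTmp(String file) { tmp.remove(file); }
    }

    /** Returns the number of destinations that completed steps 1-4. */
    static int replicate(List<Destination> dests, int minReplication,
                         String file, byte[] data) throws IOException {
        List<Destination> written = new ArrayList<>();
        List<IOException> failures = new ArrayList<>();
        for (Destination d : dests) {            // serial, as in the description
            try { d.writeTmp(file, data); written.add(d); }
            catch (IOException e) { failures.add(e); }
        }
        long mtime = System.currentTimeMillis(); // one client-side mtime for all copies
        int succeeded = 0;
        for (Destination d : written) {
            try { d.commit(file, mtime); succeeded++; }
            catch (IOException e) { failures.add(e); }
        }
        if (succeeded >= minReplication) {       // logically committed: log, do not throw
            for (IOException e : failures) System.err.println("ignored: " + e.getMessage());
            return succeeded;
        }
        for (Destination d : dests) d.deleteTmp(file);
        IOException error = new IOException(succeeded + " destination(s) succeeded, "
                + "minReplication is " + minReplication);
        for (IOException e : failures) error.addSuppressed(e);
        throw error;
    }

    public static void main(String[] args) throws IOException {
        List<Destination> dests = Arrays.asList(new InMemoryDestination(false),
                new InMemoryDestination(true), new InMemoryDestination(false));
        int n = replicate(dests, 2, "file", new byte[] {1, 2, 3});
        System.out.println("committed on " + n + " destinations");
    }
}
```

With one of three destinations failing and minReplication set to 2, the failure is only logged and the call returns 2; raising minReplication to 3 makes the same call throw.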
> There are 2 policies/additions to the read call path that make it more expensive but improve the user experience:
> - readMostRecent - when this policy is enabled, Nfly first checks the mtime for the path under all URIs and sorts them from most recent to least recent. Nfly then sorts the set of most recent URIs topologically in the same manner as described above.
> - repairOnRead - when readMostRecent is enabled, Nfly already has to RPC all underlying destinations. With repairOnRead, the Nfly filesystem additionally attempts to refresh destinations where the path is missing or stale, using the nearest available most recent destination.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org