Date: Wed, 3 Feb 2016 18:19:39 +0000 (UTC)
From: "Vishwajeet Dusane (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Updated] (HADOOP-12666) Support Windows Azure Data Lake - as a file system in Hadoop

     [ https://issues.apache.org/jira/browse/HADOOP-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vishwajeet Dusane updated HADOOP-12666:
---------------------------------------
    Status: In Progress  (was: Patch Available)

> Support Windows Azure Data Lake - as a file system in Hadoop
> ------------------------------------------------------------
>
>                 Key: HADOOP-12666
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12666
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, fs/azure, tools
>            Reporter: Vishwajeet Dusane
>            Assignee: Vishwajeet Dusane
>         Attachments: HADOOP-12666-002.patch, HADOOP-12666-003.patch, HADOOP-12666-1.patch
>
>   Original Estimate: 336h
>          Time Spent: 336h
>  Remaining Estimate: 0h
>
> h2. Description
> This JIRA describes a new file system implementation for accessing Windows Azure Data Lake Store (ADL) from within Hadoop. This would enable existing Hadoop applications such as MR, Hive, HBase, etc. to use the ADL store as input or output.
>  
> ADL is an ultra-high-capacity store, optimized for massive throughput, with rich management and security features. More details are available at https://azure.microsoft.com/en-us/services/data-lake-store/
> h2. High level design
> The ADL file system exposes RESTful interfaces compatible with the WebHDFS specification 2.7.1.
> At a high level, the code here extends the SWebHdfsFileSystem class to provide an implementation for accessing ADL storage; the adl scheme is used to access it over HTTPS. We use the URI scheme:
> {code}adl:///path/to/file{code}
> to address individual files and folders. Tests are implemented mostly using a Contract implementation for the ADL functionality, with an option to test against real ADL storage if configured.
> h2. Credits and history
> This has been ongoing work for a while, and the early version of this work can be seen in. Credit for this work goes to the team: [~vishwajeet.dusane], [~snayak], [~srevanka], [~kiranch], [~chakrab], [~omkarksa], [~snvijaya], [~ansaiprasanna], [~jsangwan]
> h2. Test
> Besides Contract tests, we have used ADL as an additional file system in the current public preview release. Various customer and test workloads have been run against clusters with such configurations for quite some time. The current version reflects the version of the code tested and used in our production environment.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)