hadoop-common-issues mailing list archives

From "Vishwajeet Dusane (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-12666) Support Microsoft Azure Data Lake - as a file system in Hadoop
Date Tue, 12 Apr 2016 12:14:25 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vishwajeet Dusane updated HADOOP-12666:
---------------------------------------
    Attachment: HADOOP-12666-010.patch

- Removed the asynchronous append operation.
- For version tracking, added usage of Hadoop {{VersionInfo}}.
- Reduced the buffer caching size from 16 MB to 8 MB. During a few HBase runs, such as snapshot
creation, aggressive buffering did not help because of random positional reads in the file.
- Removed {{LOG_VERSION}} usage, since telemetry is not part of this patch set anyway.
- Added a check for the stream's closed status in {{BatchAppendOutputStream}}.
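
As an illustration of the last item only, a closed-status check on an output stream might look roughly like the sketch below. This is not the code from the patch: {{ClosedCheckOutputStream}}, {{ensureOpen}}, and the wrapped {{sink}} are hypothetical names, and the behavior shown (writes and flushes fail after close, a second close is a no-op) is an assumption about what such a check typically does.

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical stand-in for illustration; NOT the BatchAppendOutputStream from the patch.
class ClosedCheckOutputStream extends OutputStream {
    private final OutputStream sink; // assumed underlying stream
    private boolean closed = false;

    ClosedCheckOutputStream(OutputStream sink) {
        this.sink = sink;
    }

    // Reject any operation once the stream has been closed.
    private void ensureOpen() throws IOException {
        if (closed) {
            throw new IOException("Stream is closed");
        }
    }

    @Override
    public synchronized void write(int b) throws IOException {
        ensureOpen();
        sink.write(b);
    }

    @Override
    public synchronized void flush() throws IOException {
        ensureOpen();
        sink.flush();
    }

    @Override
    public synchronized void close() throws IOException {
        if (closed) {
            return; // closing an already closed stream is a no-op
        }
        closed = true;
        sink.close();
    }
}
```

Because {{OutputStream.write(byte[], int, int)}} delegates to {{write(int)}} by default, the same check also covers bulk writes in this sketch.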

> Support Microsoft Azure Data Lake - as a file system in Hadoop
> --------------------------------------------------------------
>
>                 Key: HADOOP-12666
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12666
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, fs/azure, tools
>            Reporter: Vishwajeet Dusane
>            Assignee: Vishwajeet Dusane
>         Attachments: Create_Read_Hadoop_Adl_Store_Semantics.pdf, HADOOP-12666-002.patch,
HADOOP-12666-003.patch, HADOOP-12666-004.patch, HADOOP-12666-005.patch, HADOOP-12666-006.patch,
HADOOP-12666-007.patch, HADOOP-12666-008.patch, HADOOP-12666-009.patch, HADOOP-12666-010.patch,
HADOOP-12666-1.patch
>
>   Original Estimate: 336h
>          Time Spent: 336h
>  Remaining Estimate: 0h
>
> h2. Description
> This JIRA describes a new file system implementation for accessing Microsoft Azure Data
Lake Store (ADL) from within Hadoop. This would enable existing Hadoop applications such as
MapReduce (MR), Hive, and HBase to use the ADL store as input or output.
>  
> ADL is an ultra-high-capacity store, optimized for massive throughput, with rich management and
security features. More details are available at https://azure.microsoft.com/en-us/services/data-lake-store/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
