hadoop-common-issues mailing list archives

From "shimingfei (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12756) Incorporate Aliyun OSS file system implementation
Date Thu, 04 Feb 2016 02:31:39 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131602#comment-15131602 ]

shimingfei commented on HADOOP-12756:
-------------------------------------

Thanks for your detailed comments, Steve.
OSS is very similar to S3, so the testing will be similar. We already have an implementation, and
it works fine with our use cases and micro-benchmarks (sort and terasort) on both Hadoop and
Spark.

You are right that the work can be packaged as an independent jar that users' applications load
as an external library. But we think it is better to integrate it into Hadoop as a module
under hadoop tools, for ease of maintenance and ease of use.

> Incorporate Aliyun OSS file system implementation
> -------------------------------------------------
>
>                 Key: HADOOP-12756
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12756
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs
>            Reporter: shimingfei
>            Assignee: shimingfei
>         Attachments: OSS integration.pdf
>
>
> Aliyun OSS is widely used among China’s cloud users, but currently it is not easy to
> access data stored on OSS from a user’s Hadoop/Spark application, because Hadoop has no
> native support for OSS.
> This work aims to integrate Aliyun OSS with Hadoop. With simple configuration, Spark/Hadoop
> applications can read and write data on OSS without any code change, narrowing the gap between
> the user’s application and data storage, as has already been done for S3 in Hadoop.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
