spark-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-7481) Add Hadoop 2.6+ profile to pull in object store FS accessors
Date Wed, 16 Mar 2016 10:43:33 GMT

    [ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197165#comment-15197165 ]

Steve Loughran commented on SPARK-7481:
---------------------------------------

...thinking some more about this.

How about:

# adding a {{spark-cloud}} module which, initially, does nothing but declare the dependencies on {{hadoop-aws}}, {{hadoop-openstack}} and, on 2.7+, {{hadoop-azure}}.
# having the spark assembly declare a dependency on this module, but explicitly excluding all transitive dependencies other than the Hadoop ones (i.e. no Amazon libraries, no extra httpclient ones for OpenStack (if there are any), nothing Azure wants). If someone wants the relevant Amazon libraries, they need to add them explicitly via the {{--jars}} option.
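The two steps above could be sketched in Maven terms roughly as follows; the module name {{spark-cloud_2.10}}, the {{hadoop.version}} property, and the exclusion list are illustrative assumptions, not settled names:

{code:xml}
<!-- Hypothetical spark-cloud module POM fragment: declares the object
     store connectors. Artifact ids and version properties here are
     assumptions for illustration, not decided names. -->
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-openstack</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
  <!-- hadoop-azure only exists from Hadoop 2.7 onwards -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-azure</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
</dependencies>
{code}

The assembly side would then depend on the module while stripping the vendor SDKs, e.g. with a wildcard exclusion per SDK group (wildcard exclusions need Maven 3.2.1+):

{code:xml}
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-cloud_2.10</artifactId>
  <version>${project.version}</version>
  <exclusions>
    <!-- keep the transitive hadoop-* connectors, drop the vendor SDKs;
         a real POM would list one such exclusion per non-Hadoop group -->
    <exclusion>
      <groupId>com.amazonaws</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
{code}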

Doing it this way means that if a project depends on {{spark-cloud}}, it gets all the cloud dependencies which that version of Spark+Hadoop needs.
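That is, a downstream project would only need something like the following (the artifact name is again an assumption), and would pick up whichever {{hadoop-aws}}/{{hadoop-openstack}}/{{hadoop-azure}} versions that Spark build was compiled against:

{code:xml}
<!-- Illustrative downstream usage of the proposed module -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-cloud_2.10</artifactId>
  <version>${spark.version}</version>
</dependency>
{code}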

It also provides a placeholder for explicit cloud support, specifically

- output committers that don't rely on rename or assume that directory delete is atomic and O(1)
- some optional tests/examples to read/write data. 

The tests would be useful not just for Spark, but for catching regressions in the Hadoop/AWS/Azure code.

If people think this is good, assign it to me and I'll look at it in April.

> Add Hadoop 2.6+ profile to pull in object store FS accessors
> ------------------------------------------------------------
>
>                 Key: SPARK-7481
>                 URL: https://issues.apache.org/jira/browse/SPARK-7481
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build
>    Affects Versions: 1.3.1
>            Reporter: Steve Loughran
>
> To keep the s3n classpath right, and to add s3a, swift & azure, the dependencies of Spark in a 2.6+ profile need to add the relevant object store packages (hadoop-aws, hadoop-openstack, hadoop-azure).
> This adds more stuff to the client bundle, but will mean a single Spark package can talk to all of the stores.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

