hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13336) S3A to support per-bucket configuration
Date Thu, 05 Jan 2017 20:58:58 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-13336:
    Attachment: HADOOP-13336-HADOOP-13345-001.patch

HADOOP-13336 patch 001. This adds a new BucketConfiguration class which exports some of the
classic Configuration calls, but also pulls in some of the extension methods from S3AUtils.
Code across s3 and s3guard has been moved to it. All existing tests pass, but none of them
yet sets any bucket-specific options; those tests are TODO.

Now, looking at what DFSUtils has done, I can't help thinking I've done it completely wrong.
Instead of having the lookup & fallback, I should just do what is done there with propagation:
just take the config for an FS and patch in all the properties from the bucket to the top-level
values. That makes it harder to see what's gone wrong, but means that a lot of this patch's
complexity isn't needed: no new type or anything. And we can add a log @ debug of what propagation
takes place.
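
For illustration only, the propagation idea might look roughly like the sketch below. It is not
the patch: it uses a plain Map rather than Hadoop's Configuration class, and the key layout
(fs.s3a.bucket.<bucket>.<option> overriding fs.s3a.<option>), the class name and the method name
are all assumptions made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Sketch of "propagation": copy per-bucket values over the top-level ones. */
final class BucketConfigPropagation {
    private static final Logger LOG =
        Logger.getLogger(BucketConfigPropagation.class.getName());

    // Assumed key layout, chosen for this example only.
    static final String FS_PREFIX = "fs.s3a.";
    static final String BUCKET_PREFIX = FS_PREFIX + "bucket.";

    /**
     * Returns a copy of source in which every fs.s3a.bucket.BUCKET.X entry
     * has been written to fs.s3a.X. The source map is left untouched.
     */
    static Map<String, String> propagate(Map<String, String> source, String bucket) {
        String prefix = BUCKET_PREFIX + bucket + ".";
        Map<String, String> patched = new LinkedHashMap<>(source);
        for (Map.Entry<String, String> e : source.entrySet()) {
            String key = e.getKey();
            if (key.startsWith(prefix) && key.length() > prefix.length()) {
                String generic = FS_PREFIX + key.substring(prefix.length());
                // Debug-level record of what was propagated.
                LOG.log(Level.FINE, "Propagating {0} to {1}", new Object[] {key, generic});
                patched.put(generic, e.getValue());
            }
        }
        return patched;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("fs.s3a.endpoint", "s3.amazonaws.com");
        conf.put("fs.s3a.bucket.b1.endpoint", "s3.eu-central-1.amazonaws.com");
        // The filesystem for bucket b1 sees its own endpoint at the top level.
        System.out.println(propagate(conf, "b1").get("fs.s3a.endpoint"));
    }
}
```

With this shape the rest of the filesystem code keeps reading the ordinary top-level keys, so no
new configuration type is needed; the debug log is the only trace of where a value came from.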

Let me do that tomorrow. At least now I know my way round what s3guard does better.

> S3A to support per-bucket configuration
> ---------------------------------------
>                 Key: HADOOP-13336
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13336
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13336-HADOOP-13345-001.patch
> S3a now supports different regions, by way of declaring the endpoint, but you can't
> do things like read in one region, write back in another (e.g. a distcp backup), because only
> one region can be specified in a configuration.
> If s3a supported region declaration in the URL, e.g. s3a://b1.frankfurt s3a://b2.seol,
> then this would be possible.
> Swift does this with a full filesystem binding/config: endpoints, username, etc, in the
> XML file. Would we need to do that much? It'd be simpler initially to use a domain suffix
> of a URL to set the region of a bucket from the domain and have the aws library sort the details
> out itself, maybe with some config options for working with non-AWS infra

