hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13421) Switch to v2 of the S3 List Objects API in S3A
Date Mon, 04 Sep 2017 14:39:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16152669#comment-16152669 ]

Steve Loughran commented on HADOOP-13421:

Nothing wrong with getting things utterly wrong...it's what the tests are there to catch.

The specific test you mention went in with all the listing rework... too much code to test.
See also HADOOP-10714. If you play with some of the scale test options, you can create larger
directory sets too.

> Switch to v2 of the S3 List Objects API in S3A
> ----------------------------------------------
>                 Key: HADOOP-13421
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13421
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steven K. Wong
>            Assignee: Aaron Fabbri
>            Priority: Minor
>         Attachments: HADOOP-13421-HADOOP-13345.001.patch
> Unlike [version 1|http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html]
> of the S3 List Objects API, [version 2|http://docs.aws.amazon.com/AmazonS3/latest/API/v2-RESTBucketGET.html]
> by default does not fetch object owner information, which S3A doesn't need anyway. By switching
> to v2, there will be less data to transfer/process. Also, it should be more robust when listing
> a versioned bucket with "a large number of delete markers" ([according to AWS|https://aws.amazon.com/releasenotes/Java/0735652458007581]).
> Methods in S3AFileSystem that use this API include:
> * getFileStatus(Path)
> * innerDelete(Path, boolean)
> * innerListStatus(Path)
> * innerRename(Path, Path)
> Requires AWS SDK 1.10.75 or later.
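
For readers unfamiliar with the v2 listing style, here is a minimal, self-contained Java sketch of how a caller might page through a v2-style listing. It is not S3A code and does not call the AWS SDK: FakeBucket, Page, Entry and listAllKeys are hypothetical names invented for illustration, and the bucket is just an in-memory sorted set. It assumes two properties of the v2 API: results are paged with a continuation token returned by the previous response, and owner information is only filled in when explicitly requested. In the AWS SDK for Java 1.x, the corresponding request and result types are ListObjectsV2Request and ListObjectsV2Result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class ListV2Sketch {

    /** One listed object; owner stays null unless the caller asked for it. */
    static final class Entry {
        final String key;
        final String owner;
        Entry(String key, String owner) { this.key = key; this.owner = owner; }
    }

    /** One page of results plus the token for the next request (null on the last page). */
    static final class Page {
        final List<Entry> entries;
        final String nextContinuationToken;
        Page(List<Entry> entries, String nextContinuationToken) {
            this.entries = entries;
            this.nextContinuationToken = nextContinuationToken;
        }
    }

    /** Hypothetical in-memory stand-in for a bucket; not part of S3A or the AWS SDK. */
    static final class FakeBucket {
        private final TreeSet<String> keys = new TreeSet<>();

        FakeBucket(List<String> initialKeys) { keys.addAll(initialKeys); }

        Page listV2(String prefix, int maxKeys, String continuationToken, boolean fetchOwner) {
            // Real tokens are opaque strings; this sketch simply reuses the last key returned.
            Iterable<String> candidates = continuationToken == null
                    ? keys.tailSet(prefix, true)
                    : keys.tailSet(continuationToken, false);
            List<Entry> entries = new ArrayList<>();
            String last = null;
            boolean truncated = false;
            for (String key : candidates) {
                if (!key.startsWith(prefix)) {
                    break; // sorted order: no later key can match the prefix
                }
                if (entries.size() == maxKeys) {
                    truncated = true; // more matching keys remain for the next page
                    break;
                }
                // Owner data is skipped by default, which keeps each entry smaller.
                entries.add(new Entry(key, fetchOwner ? "example-owner" : null));
                last = key;
            }
            return new Page(entries, truncated ? last : null);
        }
    }

    /** Collects every key under a prefix by following continuation tokens. */
    static List<String> listAllKeys(FakeBucket bucket, String prefix, int maxKeys) {
        List<String> all = new ArrayList<>();
        String token = null;
        do {
            Page page = bucket.listV2(prefix, maxKeys, token, false);
            for (Entry e : page.entries) {
                all.add(e.key);
            }
            token = page.nextContinuationToken;
        } while (token != null);
        return all;
    }

    public static void main(String[] args) {
        FakeBucket bucket = new FakeBucket(Arrays.asList("dir/a", "dir/b", "dir/c", "other/x"));
        System.out.println(listAllKeys(bucket, "dir/", 2));
    }
}
```

The loop in listAllKeys is the part a real client would keep: issue a request, consume the page, and pass the returned token back until none is returned.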
