hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13786) Add S3Guard committer for zero-rename commits to S3 endpoints
Date Thu, 12 Oct 2017 20:36:00 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Steve Loughran updated HADOOP-13786:
------------------------------------
    Attachment: HADOOP-13786-039.patch

Patch 039

I've not pushed up a patch for a while, but been doing it on github with Ewan Higgs testing
it too. I can now confirm that I have, locally, finished off the hardest bit of the downstream
tests: *Spark outputting partitioned Parquet data.* 

With that working I'm happy there's no fundamental flaws in the design and in how it is bonding
to committers.

I want to get this in now; it's pretty much done other than docs. Reviews welcome!

Testing: s3 ireland with/without dynamo, scale on. All is well with the committers
* one transient s3guard CLI; filed separately
* one 500 server error on a getFileStatusV1 call which the new retry logic didn't handle.
There's no retry policy on a 500 request right now...we may want to evolve one. But that's
related to error handling, not the commit code itself.

> Add S3Guard committer for zero-rename commits to S3 endpoints
> -------------------------------------------------------------
>
>                 Key: HADOOP-13786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13786
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>    Affects Versions: 3.0.0-beta1
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13786-036.patch, HADOOP-13786-037.patch, HADOOP-13786-038.patch,
HADOOP-13786-039.patch, HADOOP-13786-HADOOP-13345-001.patch, HADOOP-13786-HADOOP-13345-002.patch,
HADOOP-13786-HADOOP-13345-003.patch, HADOOP-13786-HADOOP-13345-004.patch, HADOOP-13786-HADOOP-13345-005.patch,
HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-007.patch,
HADOOP-13786-HADOOP-13345-009.patch, HADOOP-13786-HADOOP-13345-010.patch, HADOOP-13786-HADOOP-13345-011.patch,
HADOOP-13786-HADOOP-13345-012.patch, HADOOP-13786-HADOOP-13345-013.patch, HADOOP-13786-HADOOP-13345-015.patch,
HADOOP-13786-HADOOP-13345-016.patch, HADOOP-13786-HADOOP-13345-017.patch, HADOOP-13786-HADOOP-13345-018.patch,
HADOOP-13786-HADOOP-13345-019.patch, HADOOP-13786-HADOOP-13345-020.patch, HADOOP-13786-HADOOP-13345-021.patch,
HADOOP-13786-HADOOP-13345-022.patch, HADOOP-13786-HADOOP-13345-023.patch, HADOOP-13786-HADOOP-13345-024.patch,
HADOOP-13786-HADOOP-13345-025.patch, HADOOP-13786-HADOOP-13345-026.patch, HADOOP-13786-HADOOP-13345-027.patch,
HADOOP-13786-HADOOP-13345-028.patch, HADOOP-13786-HADOOP-13345-028.patch, HADOOP-13786-HADOOP-13345-029.patch,
HADOOP-13786-HADOOP-13345-030.patch, HADOOP-13786-HADOOP-13345-031.patch, HADOOP-13786-HADOOP-13345-032.patch,
HADOOP-13786-HADOOP-13345-033.patch, HADOOP-13786-HADOOP-13345-035.patch, cloud-intergration-test-failure.log,
objectstore.pdf, s3committer-master.zip
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the presence of failures".
Implement it, including whatever is needed to demonstrate the correctness of the algorithm.
(that is, assuming that s3guard provides a consistent view of the presence/absence of blobs,
show that we can commit directly).
> I consider ourselves free to expose the blobstore-ness of the s3 output streams (ie.
not visible until the close()), if we need to use that to allow us to abort commit operations.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


Mime
View raw message