hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13786) Add S3Guard committer for zero-rename commits to S3 endpoints
Date Wed, 13 Sep 2017 16:28:03 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-13786:
    Attachment: HADOOP-13786-037.patch

HADOOP-13786 HADOOP-14531 lambda wrapper around all production S3 calls
* all invocations of S3 calls are wrapped where appropriate, either with once() (which does
the translation), retry() or retryUntranslated()
* javadocs state the retry policy; this is propagated up to give callers an idea of what retries
take place underneath
* commit tests -> Java 8 lambdas too
* test JSON serialization/deserialization in hadoop-common
* checkstyle
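
As a rough illustration of the wrapping pattern in the list above, here is a minimal sketch of a once()/retry() pair. It is not the code in the patch: the names `InvokerSketch` and `Operation` are hypothetical, there is no backoff, every translated failure is treated as retryable, and treating non-idempotent calls as single-attempt is a simplifying assumption.

```java
import java.io.IOException;

/** Hypothetical functional interface: an operation that may throw anything. */
@FunctionalInterface
interface Operation<T> {
  T execute() throws Exception;
}

/** Hypothetical sketch of a lambda-based invoker; not the class in the patch. */
final class InvokerSketch {
  private final int maxAttempts;

  InvokerSketch(int maxAttempts) {
    // Always allow at least one attempt.
    this.maxAttempts = Math.max(1, maxAttempts);
  }

  /** Run the operation once, translating any failure into an IOException. */
  <T> T once(String action, String path, Operation<T> operation)
      throws IOException {
    try {
      return operation.execute();
    } catch (IOException e) {
      throw e;
    } catch (Exception e) {
      throw new IOException(action + " on " + path + ": " + e, e);
    }
  }

  /**
   * Run the operation with translation, retrying on failure.
   * Assumption: non-idempotent operations get a single attempt.
   */
  <T> T retry(String action, String path, boolean idempotent,
      Operation<T> operation) throws IOException {
    int attempts = idempotent ? maxAttempts : 1;
    IOException last = null;
    for (int i = 0; i < attempts; i++) {
      try {
        return once(action, path, operation);
      } catch (IOException e) {
        last = e; // remember the failure and try again
      }
    }
    throw last;
  }
}
```

A caller would then write something like `invoker.retry("GET", path, true, () -> fetch(path))`, where `fetch` stands for whatever store call is being protected.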

The error handling includes improvements to translateException() to recognise DynamoDB throttling,
and also the JSON parse error which means an EOF during response parsing (which means, as it's
after execution, that non-idempotent calls won't retry).

The commit methods have been resilient to failures via the S3Lambda for a while; now that
it's extended to all of them we can add methods to do fault injection on all operations: the
retry logic in S3ARetryPolicy assumes that throttling (503), server error (500) and connection
setup failures are always retryable. Therefore, if the client code is done right, you could
run all the system tests with the injecting client set to throttle a limited percentage of the time.
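
A minimal sketch of what such an injecting wrapper could look like, assuming a simple probability-based trigger. The class `ThrottleInjector` and its methods are hypothetical and not part of the patch; a real injecting client would raise the SDK's own exception type rather than a plain IOException.

```java
import java.io.IOException;
import java.util.Random;
import java.util.concurrent.Callable;

/** Hypothetical fault injector: fails a configurable fraction of calls. */
final class ThrottleInjector {
  private final double failureProbability;
  private final Random random;
  private int injected;

  /**
   * @param failureProbability fraction of calls to fail, from 0.0 to 1.0
   * @param seed seed for the random source, so test runs are repeatable
   */
  ThrottleInjector(double failureProbability, long seed) {
    this.failureProbability = failureProbability;
    this.random = new Random(seed);
  }

  /** Either simulate a throttling response or run the real operation. */
  <T> T call(Callable<T> operation) throws Exception {
    // nextDouble() is in [0.0, 1.0): 0.0 never injects, 1.0 always injects.
    if (random.nextDouble() < failureProbability) {
      injected++;
      throw new IOException("Injected failure: 503 throttled");
    }
    return operation.call();
  }

  /** Number of failures injected so far. */
  int injectedCount() {
    return injected;
  }
}
```

With the probability set to a small value such as 0.05, any code path that lacks a retry wrapper should show up as a test failure rather than being silently absorbed.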

Although developed in the committer, we could tease this out (along with the moved WriteOperationsHelper)
and add it to trunk standalone. That'd reduce the size of the HADOOP-13786 diff, and provide
a single large patch for people to cherry-pick. Though if they want to backport to branch-2
they get to convert every single lambda expression into a Callable, which, even though IDEA can automate it,
will make for uglier code than:

    S3Object object = invoke.retry(text, uri, true,
        () -> client.getObject(request));
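
For comparison, a minimal sketch of the same call shape written both ways. The names here (`CallableBackport`, `invokeOnce`) are hypothetical stand-ins and the wrapper simply runs the operation once; the point is only the difference between the lambda and the anonymous Callable that a pre-Java 8 branch would need.

```java
import java.util.concurrent.Callable;

/** Hypothetical comparison of a lambda against its pre-Java 8 equivalent. */
final class CallableBackport {

  private CallableBackport() {
  }

  /** Stand-in for a retrying wrapper; here it simply runs the call once. */
  static <T> T invokeOnce(String action, Callable<T> operation)
      throws Exception {
    return operation.call();
  }

  /** Java 8 form: the operation is a one-line lambda. */
  static String withLambda(final String key) throws Exception {
    return invokeOnce("GET", () -> "object:" + key);
  }

  /** Java 7 form: the same operation as an anonymous inner class. */
  static String withAnonymousClass(final String key) throws Exception {
    return invokeOnce("GET", new Callable<String>() {
      @Override
      public String call() {
        return "object:" + key;
      }
    });
  }
}
```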

Anyway, I plan to continue with dev & test of the error handling in the committer branch,
which, after all, depends on resilience of all its operations, even in the presence of transient
failures. Once it's stable, it'd be something to pull out and get in standalone.

> Add S3Guard committer for zero-rename commits to S3 endpoints
> -------------------------------------------------------------
>                 Key: HADOOP-13786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13786
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>    Affects Versions: 3.0.0-beta1
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: cloud-intergration-test-failure.log, HADOOP-13786-036.patch, HADOOP-13786-037.patch,
HADOOP-13786-HADOOP-13345-001.patch, HADOOP-13786-HADOOP-13345-002.patch, HADOOP-13786-HADOOP-13345-003.patch,
HADOOP-13786-HADOOP-13345-004.patch, HADOOP-13786-HADOOP-13345-005.patch, HADOOP-13786-HADOOP-13345-006.patch,
HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-007.patch, HADOOP-13786-HADOOP-13345-009.patch,
HADOOP-13786-HADOOP-13345-010.patch, HADOOP-13786-HADOOP-13345-011.patch, HADOOP-13786-HADOOP-13345-012.patch,
HADOOP-13786-HADOOP-13345-013.patch, HADOOP-13786-HADOOP-13345-015.patch, HADOOP-13786-HADOOP-13345-016.patch,
HADOOP-13786-HADOOP-13345-017.patch, HADOOP-13786-HADOOP-13345-018.patch, HADOOP-13786-HADOOP-13345-019.patch,
HADOOP-13786-HADOOP-13345-020.patch, HADOOP-13786-HADOOP-13345-021.patch, HADOOP-13786-HADOOP-13345-022.patch,
HADOOP-13786-HADOOP-13345-023.patch, HADOOP-13786-HADOOP-13345-024.patch, HADOOP-13786-HADOOP-13345-025.patch,
HADOOP-13786-HADOOP-13345-026.patch, HADOOP-13786-HADOOP-13345-027.patch, HADOOP-13786-HADOOP-13345-028.patch,
HADOOP-13786-HADOOP-13345-028.patch, HADOOP-13786-HADOOP-13345-029.patch, HADOOP-13786-HADOOP-13345-030.patch,
HADOOP-13786-HADOOP-13345-031.patch, HADOOP-13786-HADOOP-13345-032.patch, HADOOP-13786-HADOOP-13345-033.patch,
HADOOP-13786-HADOOP-13345-035.patch, objectstore.pdf, s3committer-master.zip
> A goal of this code is "support O(1) commits to S3 repositories in the presence of failures".
Implement it, including whatever is needed to demonstrate the correctness of the algorithm.
(that is, assuming that s3guard provides a consistent view of the presence/absence of blobs,
show that we can commit directly).
> I consider ourselves free to expose the blobstore-ness of the s3 output streams (ie.
not visible until the close()), if we need to use that to allow us to abort commit operations.

