hadoop-common-issues mailing list archives

From "Aaron Fabbri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13793) s3guard: add inconsistency injection, integration tests
Date Thu, 03 Nov 2016 23:37:58 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15634668#comment-15634668 ]

Aaron Fabbri commented on HADOOP-13793:

[~stevel@apache.org], [~eddyxu] we've discussed this a bit, and [~liuml07] may be interested
as well.

One idea I have is to take advantage of the separation of the S3Client introduced in HADOOP-13447
to create a "delay wrapper" or layer on top of the AmazonS3 client that selectively delays
returning certain results.  We could call this wrapper DelayedAmazonS3 or something.  For example:

1. Create a file s3a://bucket/a/b/file
2. Call into DelayedAmazonS3 and tell it to delay visibility of that path for X milliseconds.
(where X is above or below the max retry threshold)
3. Call listStatus(s3a://bucket/a/b/file).  DelayedAmazonS3 will not return that path in the
listing.  MetadataStore will add it to the listing though.
4. Call open(s3a://bucket/a/b/file).  The initial open will fail as DelayedAmazonS3 will fake a
file-not-found.  The S3A client will go into retry mode and we can assert that it is / is
not successful getting the object.  We can also assert on (to be added) FS statistics that
at least one retry happened (being mindful of timing issues that could cause flaky tests).
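
The steps above can be sketched roughly as follows.  This is only an illustration under
stated assumptions, not a proposed patch: ObjectStore, InMemoryStore, DelayedObjectStore and
openWithRetry are hypothetical names, the interface is a tiny stand-in rather than the real
AmazonS3 interface, and MetadataStore is not modelled at all.  The clock is injected so that
the "delay" and the retry backoff never depend on wall-clock time, which is one way to avoid
the timing flakiness mentioned in step 4.

```java
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the client interface; NOT the AWS SDK's AmazonS3. */
interface ObjectStore {
    void put(String key);
    void open(String key) throws FileNotFoundException;
    List<String> list(String prefix);
}

/** Always-consistent in-memory store playing the role of the real client. */
class InMemoryStore implements ObjectStore {
    private final Set<String> keys = new TreeSet<>();

    public synchronized void put(String key) { keys.add(key); }

    public synchronized void open(String key) throws FileNotFoundException {
        if (!keys.contains(key)) throw new FileNotFoundException(key);
    }

    public synchronized List<String> list(String prefix) {
        List<String> out = new ArrayList<>();
        for (String k : keys) if (k.startsWith(prefix)) out.add(k);
        return out;
    }
}

/** Wrapper that hides selected keys until their visibility deadline passes. */
class DelayedObjectStore implements ObjectStore {
    private final ObjectStore inner;
    private final LongSupplier clockMillis; // injected so tests need no real sleeps
    private final Map<String, Long> visibleAt = new ConcurrentHashMap<>();

    DelayedObjectStore(ObjectStore inner, LongSupplier clockMillis) {
        this.inner = inner;
        this.clockMillis = clockMillis;
    }

    /** Step 2: hide the key for delayMillis, measured from the current clock value. */
    void delayVisibility(String key, long delayMillis) {
        visibleAt.put(key, clockMillis.getAsLong() + delayMillis);
    }

    private boolean hidden(String key) {
        Long deadline = visibleAt.get(key);
        return deadline != null && clockMillis.getAsLong() < deadline;
    }

    public void put(String key) { inner.put(key); }

    /** Step 4: fake a file-not-found while the key is still hidden. */
    public void open(String key) throws FileNotFoundException {
        if (hidden(key)) throw new FileNotFoundException("injected: " + key);
        inner.open(key);
    }

    /** Step 3: drop hidden keys from the listing. */
    public List<String> list(String prefix) {
        List<String> out = new ArrayList<>(inner.list(prefix));
        out.removeIf(this::hidden);
        return out;
    }
}

class DelayedStoreDemo {
    /** Returns the number of retries needed, or -1 if every attempt failed. */
    static int openWithRetry(ObjectStore store, String key, int maxRetries,
                             long backoffMillis, AtomicLong clock) {
        for (int retry = 0; retry <= maxRetries; retry++) {
            try {
                store.open(key);
                return retry;
            } catch (FileNotFoundException e) {
                clock.addAndGet(backoffMillis); // simulated sleep between attempts
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        AtomicLong clock = new AtomicLong(0);
        DelayedObjectStore store = new DelayedObjectStore(new InMemoryStore(), clock::get);
        store.put("a/b/file");                       // step 1
        store.delayVisibility("a/b/file", 250);      // step 2
        System.out.println(store.list("a/b/"));      // step 3: prints []
        System.out.println(openWithRetry(store, "a/b/file", 5, 100, clock)); // step 4: prints 3
        System.out.println(store.list("a/b/"));      // prints [a/b/file]
    }
}
```

Choosing a delay larger than maxRetries * backoffMillis exercises the "not successful" branch
instead: openWithRetry then returns -1, and a test can assert on that.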

This is just one example, but it illustrates the kind of approach I have in mind.  [~eddyxu] also had
an idea to use the underlying S3 bucket to do something similar.  That could be a little less
deterministic, though, depending on how it's implemented.  If you have any other ideas, please share.

> s3guard: add inconsistency injection, integration tests
> -------------------------------------------------------
>                 Key: HADOOP-13793
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13793
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Aaron Fabbri
> Many of us share concerns that testing the consistency features of S3Guard will be difficult
> if we depend on the rare and unpredictable occurrence of actual inconsistency in S3 to exercise
> those code paths.
> I think we should have a mechanism for injecting failure to force exercising of the consistency
> codepaths in S3Guard.
> Requirements:
> - Integration tests that cause S3A to see the types of inconsistency we address with S3Guard.
> - These are deterministic integration tests.
> Unit tests are possible as well, if we were to stub out the S3Client.  That may be less
> bang for the buck, though.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
