hadoop-user mailing list archives

From Chris Nauroth <cnaur...@hortonworks.com>
Subject Re: S3AFileSystem & read-after-write consistency
Date Tue, 18 Oct 2016 19:34:21 GMT
Hello Dave,

You are correct that S3A currently may suffer unexpected effects from eventual consistency
due to negative caching on the S3 side for the initial HEAD request.  In practice, I have
never seen any negative consequences from this particular aspect of S3 eventual consistency,
but in theory the problem is possible.

If you are interested in mitigating the effects of S3 eventual consistency, then you might
be interested in watching development of the S3Guard project, tracked in Apache JIRA HADOOP-13345.

https://issues.apache.org/jira/browse/HADOOP-13345

To summarize, we plan to support use of an external store with strong consistency guarantees
for S3A file system metadata.  In the interaction you described, we could consult the consistent
metadata store instead of sending a HEAD request to S3 to determine if the object already
exists.
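As a rough illustration of the idea above, here is a minimal in-memory sketch in Java. It is not S3Guard code and uses no Hadoop or AWS classes: `ToyObjectStore`, `ToyMetadataStore`, and both `createVia...` methods are hypothetical names, and the negative cache is a deliberately simplified model of the S3 behaviour under discussion.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: every class and method name here is hypothetical.
final class ConsistencySketch {

    /** Stand-in for S3: a HEAD on a missing key is remembered as "not found". */
    static final class ToyObjectStore {
        private final Map<String, byte[]> objects = new HashMap<>();
        private final Set<String> negativeCache = new HashSet<>();

        boolean head(String key) {
            if (negativeCache.contains(key)) {
                return false; // stale answer, even if the object now exists
            }
            if (!objects.containsKey(key)) {
                negativeCache.add(key); // remember the miss
                return false;
            }
            return true;
        }

        void put(String key, byte[] data) {
            objects.put(key, data); // assumption: a PUT does not clear the cached miss
        }

        void expireNegativeCache() {
            negativeCache.clear(); // models the store eventually converging
        }
    }

    /** Stand-in for an external, strongly consistent metadata store. */
    static final class ToyMetadataStore {
        private final Set<String> keys = new HashSet<>();

        boolean exists(String key) { return keys.contains(key); }

        void record(String key) { keys.add(key); }
    }

    /** Existence check via HEAD before the PUT; returns whether a later HEAD sees the object. */
    static boolean createViaHead(ToyObjectStore s3, String key) {
        s3.head(key);             // pre-create check, caches the miss
        s3.put(key, new byte[0]);
        return s3.head(key);
    }

    /** Existence check via the metadata store; no HEAD is sent before the PUT. */
    static boolean createViaMetadataStore(ToyObjectStore s3, ToyMetadataStore meta, String key) {
        meta.exists(key);         // pre-create check, never touches the object store
        s3.put(key, new byte[0]);
        meta.record(key);
        return s3.head(key);
    }

    public static void main(String[] args) {
        System.out.println(createViaHead(new ToyObjectStore(), "data/part-0000"));
        System.out.println(createViaMetadataStore(
                new ToyObjectStore(), new ToyMetadataStore(), "data/part-0000"));
    }
}
```

In this toy model the first call reports the new object as missing until the cached miss expires, while the second sees it immediately because no HEAD preceded the PUT.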

--Chris Nauroth

From: Dave Maughan <davidamaughan@gmail.com>
Date: Thursday, October 6, 2016 at 4:07 AM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: S3AFileSystem & read-after-write consistency

Hi,

I'm investigating S3's read-after-write consistency model with S3AFileSystem and something
is not quite clear to me, so I'm hoping someone more knowledgeable can clarify it for me.

Amazon state (http://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html):

    "Amazon S3 provides read-after-write consistency for PUTS of new objects in your S3 bucket
in all regions with one caveat. The caveat is that if you make a HEAD or GET request to the
key name (to find if the object exists) before creating the object, Amazon S3 provides eventual
consistency for read-after-write".

In S3AFileSystem, the call chain is create -> exists -> getFileStatus -> AmazonS3Client.getObjectMetadata
(HEAD).
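To make the order of requests in that chain concrete, here is a small Java sketch that records what would be sent to S3. It is a toy model, not the Hadoop source: `CreateCallChainSketch` is a hypothetical class, its methods only borrow the names from the chain above, and it simplifies by returning booleans and issuing the PUT inside `create`.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: mirrors the call chain named in the message, nothing more.
final class CreateCallChainSketch {

    /** Requests that would reach S3, in the order they are issued. */
    final List<String> requests = new ArrayList<>();

    // Stand-in for AmazonS3Client.getObjectMetadata: one HEAD request.
    private boolean getObjectMetadata(String key) {
        requests.add("HEAD " + key);
        return false; // assumption: the key does not exist yet
    }

    private boolean getFileStatus(String key) {
        return getObjectMetadata(key);
    }

    private boolean exists(String key) {
        return getFileStatus(key);
    }

    /** The existence check runs first, so a HEAD always precedes the PUT. */
    void create(String key, boolean overwrite) {
        if (exists(key) && !overwrite) {
            throw new IllegalStateException("already exists: " + key);
        }
        requests.add("PUT " + key);
    }

    public static void main(String[] args) {
        CreateCallChainSketch fs = new CreateCallChainSketch();
        fs.create("data/part-0000", false);
        System.out.println(fs.requests);
    }
}
```

The recorded sequence is a HEAD followed by a PUT for the same key, which is the pattern the quoted caveat describes.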

Does this mean that currently, S3AFileSystem cannot take advantage of S3's read-after-write
consistency?

Thanks
- Dave
