hadoop-common-issues mailing list archives

From "Aaron Fabbri (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-15400) Improve S3Guard documentation on Authoritative Mode implementation
Date Thu, 19 Apr 2018 21:18:01 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-15400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Fabbri updated HADOOP-15400:
----------------------------------
    Summary: Improve S3Guard documentation on Authoritative Mode implementation  (was: Improve
S3Guard documentation on Authoritative Mode implemenation)

> Improve S3Guard documentation on Authoritative Mode implementation
> ------------------------------------------------------------------
>
>                 Key: HADOOP-15400
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15400
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 3.0.1
>            Reporter: Aaron Fabbri
>            Assignee: Gabor Bota
>            Priority: Minor
>
> Part of the design of S3Guard is support for skipping the call to S3 listObjects and
serving directory listings out of the MetadataStore under certain circumstances.  This feature
is called "authoritative" mode.  I've talked to many people about this feature and it seems
to be universally confusing.
> I suggest we improve / add a section to the s3guard.md site docs elaborating on what
Authoritative Mode is.
> It is *not* treating the MetadataStore (e.g. DynamoDB) as the source of truth in general.
> It *is* the ability to short-circuit S3 list objects and serve listings from the MetadataStore
in some circumstances: 
> For S3A to skip S3's list objects on some *path*, and serve it directly from the MetadataStore,
the following things must all be true:
>  # The MetadataStore implementation persists the bit {{DirListingMetadata.isAuthoritative}} set when calling {{MetadataStore#put(DirListingMetadata)}}.
>  # The S3A client is configured to allow the MetadataStore to be the authoritative source of a directory listing (fs.s3a.metadatastore.authoritative=true).
>  # The MetadataStore has a full listing for *path* stored in it.  This only happens if the FS client (s3a) has explicitly stored a full directory listing with {{DirListingMetadata.isAuthoritative=true}} before the listing request happens.
> Note that #1 currently happens only in LocalMetadataStore. Adding support to DynamoDBMetadataStore is covered in HADOOP-14154.
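The three conditions above can be sketched as a single check. This is a minimal illustration and not the S3A implementation: the class `AuthoritativeListingSketch`, the type `StoredListing`, and the method `canSkipS3List` are hypothetical names invented for this example. The boolean parameter stands in for fs.s3a.metadatastore.authoritative, and the stored bit is assumed to come back as false whenever the store did not persist it or never received a full listing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class AuthoritativeListingSketch {

    /** Hypothetical stand-in for a stored directory listing (not the real DirListingMetadata). */
    static final class StoredListing {
        final List<String> children;
        final boolean isAuthoritative; // the bit as returned by the store

        StoredListing(List<String> children, boolean isAuthoritative) {
            this.children = children;
            this.isAuthoritative = isAuthoritative;
        }
    }

    /**
     * Returns true only if the client setting allows it AND the store
     * returned a listing whose "full listing" bit is set. A store that
     * does not persist the bit is assumed to return it as false.
     */
    static boolean canSkipS3List(boolean clientAllowsAuthoritative,
                                 Optional<StoredListing> fromStore) {
        return clientAllowsAuthoritative
            && fromStore.map(l -> l.isAuthoritative).orElse(false);
    }

    public static void main(String[] args) {
        StoredListing full = new StoredListing(Arrays.asList("a", "b"), true);
        StoredListing partial = new StoredListing(Arrays.asList("a"), false);

        // Client allows it and the stored listing is marked full.
        System.out.println(canSkipS3List(true, Optional.of(full)));
        // Client setting is off, so S3 must still be listed.
        System.out.println(canSkipS3List(false, Optional.of(full)));
        // Stored listing is not marked full.
        System.out.println(canSkipS3List(true, Optional.of(partial)));
        // Nothing stored for the path.
        System.out.println(canSkipS3List(true, Optional.empty()));
    }
}
```

Only the first call in `main` satisfies every condition; in the other three cases the client would still have to list the path in S3.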
> Also, the multiple uses of the word "authoritative" are confusing. Two meanings are used:
>  1. In the FS client configuration, fs.s3a.metadatastore.authoritative
>  - Describes the behavior of S3A code (not the MetadataStore)
>  - "S3A is allowed to skip S3.list() when it has a full listing from the MetadataStore"
>  2. In the MetadataStore
>  - When storing a dir listing, the caller can set a bit, isAuthoritative
>  - 1 : "full contents of directory"
>  - 0 : "may not be full listing"
> Note that a MetadataStore *MAY* persist this bit (not *MUST*).
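To illustrate the MAY-versus-MUST point, here is a rough sketch of two toy stores, one that persists the bit and one that drops it. All names (`PersistBitSketch`, `ToyStore`, `PersistingStore`, `NonPersistingStore`, `Listing`) are hypothetical and only loosely modelled on the MetadataStore put/get idea; the field is called `fullListing` purely to mirror the rename suggested in this issue.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PersistBitSketch {

    /** Hypothetical directory listing carrying the "full listing" bit. */
    static final class Listing {
        final List<String> children;
        final boolean fullListing;

        Listing(List<String> children, boolean fullListing) {
            this.children = children;
            this.fullListing = fullListing;
        }
    }

    /** Hypothetical, heavily simplified store interface. */
    interface ToyStore {
        void put(String path, Listing listing);
        Listing get(String path);
    }

    /** Keeps the listing as given, including the bit. */
    static final class PersistingStore implements ToyStore {
        private final Map<String, Listing> entries = new HashMap<>();

        public void put(String path, Listing listing) {
            entries.put(path, listing);
        }

        public Listing get(String path) {
            return entries.get(path);
        }
    }

    /** Keeps only the children; the bit always reads back as false. */
    static final class NonPersistingStore implements ToyStore {
        private final Map<String, List<String>> entries = new HashMap<>();

        public void put(String path, Listing listing) {
            entries.put(path, listing.children);
        }

        public Listing get(String path) {
            List<String> children = entries.get(path);
            return children == null ? null : new Listing(children, false);
        }
    }

    public static void main(String[] args) {
        Listing full = new Listing(Arrays.asList("a", "b"), true);
        ToyStore keeps = new PersistingStore();
        ToyStore drops = new NonPersistingStore();
        keeps.put("/dir", full);
        drops.put("/dir", full);
        System.out.println("persisting store bit: " + keeps.get("/dir").fullListing);
        System.out.println("non-persisting store bit: " + drops.get("/dir").fullListing);
    }
}
```

With the second store, a client could never treat a stored listing as complete, regardless of how the client-side setting is configured.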
> We should probably rename {{DirListingMetadata.isAuthoritative}} to {{.fullListing}}, or at least add a comment where it is used to clarify its meaning.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

