hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13876) S3Guard: better support for multi-bucket access
Date Sat, 28 Jan 2017 13:38:25 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15844062#comment-15844062

Steve Loughran commented on HADOOP-13876:

* log4j diffs are presumably unintentional; leave them out of the commit

* Think you included your test settings in core-site.xml. FWIW, I keep all that in auth-keys.
* In the precondition check you can actually use varargs error strings, e.g. swap {{"Path
'" + path + "' is missing bucket."}} for {{"Path %s is missing bucket", path}}. This
saves on string creation when the precondition is met. Do remember to use %s instead of the {}
placeholder we use in the logs —I still get that wrong.
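
The varargs form can be sketched with a minimal stand-in for Guava's {{Preconditions.checkArgument(boolean, String, Object...)}} (the real class lives in {{com.google.common.base}}; this demo only mirrors its behavior to show where the saving comes from):

```java
public class PreconditionDemo {
  // Minimal stand-in for Guava's Preconditions.checkArgument(boolean, String, Object...).
  // The template is only formatted when the check fails, so the happy path
  // builds no error string at all.
  static void checkArgument(boolean expression, String template, Object... args) {
    if (!expression) {
      throw new IllegalArgumentException(String.format(template, args));
    }
  }

  public static void main(String[] args) {
    String path = "/path1";
    // Precondition met: no string formatting happens.
    checkArgument(true, "Path %s is missing bucket", path);
    try {
      // Precondition violated: the template is expanded with %s, not {}.
      checkArgument(false, "Path %s is missing bucket", path);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage()); // prints "Path /path1 is missing bucket"
    }
  }
}
```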

* The version marker test is failing because the version key string has changed:
java.lang.AssertionError: Path metadata fromfrom { Item: {parent=../VERSION, child=../VERSION,
table_version=100, table_created=0} } expected null, but was:<PathMetadata{fileStatus=S3AFileStatus{path=s3a:/../VERSION;
isDirectory=false; length=0; replication=1; blocksize=0; modification_time=0; access_time=0;
owner=alice; group=alice; permission=rw-rw-rw-; isSymlink=false} isEmptyDirectory=false}>
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotNull(Assert.java:664)
	at org.junit.Assert.assertNull(Assert.java:646)
	at org.apache.hadoop.fs.s3a.s3guard.TestPathMetadataDynamoDBTranslation.testVersionMarkerNotStatusIllegalPath(TestPathMetadataDynamoDBTranslation.java:234)

Having {{s3a:/../VERSION}} seems fine too: the main thing is to have a key which is impossible
to get into the system any other way. What is needed now is for the version key, or the
test probe, to be tweaked so the test passes.
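
The "impossible key" idea can be sketched like this; {{isVersionMarker}} is a hypothetical helper, not the actual S3Guard code, and the {{../VERSION}} value is taken from the test output above:

```java
public class VersionMarkerDemo {
  // Marker key as seen in the failing test's output (illustrative constant).
  static final String VERSION_MARKER = "../VERSION";

  // A real S3A path component can never be "..": path normalization strips it.
  // So an item whose parent and child keys are both "../VERSION" can only be
  // the table's version marker, never metadata for a real file.
  static boolean isVersionMarker(String parent, String child) {
    return VERSION_MARKER.equals(parent) && VERSION_MARKER.equals(child);
  }

  public static void main(String[] args) {
    System.out.println(isVersionMarker("../VERSION", "../VERSION")); // true
    System.out.println(isVersionMarker("/bucket-a", "path1"));       // false
  }
}
```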

Now, with this code, we are forever going to prevent people wiring up "s3n" or "s3" to the
s3a scheme. Is that bad? I don't really think so; we just need to be aware that it will happen.

> S3Guard: better support for multi-bucket access
> -----------------------------------------------
>                 Key: HADOOP-13876
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13876
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Aaron Fabbri
>            Assignee: Aaron Fabbri
>         Attachments: HADOOP-13876-HADOOP-13345.000.patch, HADOOP-13876-HADOOP-13345.001.patch,
> HADOOP-13449 adds support for DynamoDBMetadataStore.
> The code currently supports two options for choosing DynamoDB table names:
> 1. Use name of each s3 bucket and auto-create a DynamoDB table for each.
> 2. Configure a table name in the {{fs.s3a.s3guard.ddb.table}} parameter.
> However, if a user sets {{fs.s3a.s3guard.ddb.table}} and accesses multiple buckets, DynamoDBMetadataStore
does not properly differentiate between paths belonging to different buckets.  For example,
it would treat s3a://bucket-a/path1 as the same as s3a://bucket-b/path1.
> Goals for this JIRA:
> - Allow for a "one DynamoDB table per cluster" configuration.  If a user accesses multiple
buckets with that single table, it should work correctly.  
> - Explain which credentials are used for DynamoDB.  Currently each S3AFileSystem has
its own DynamoDBMetadataStore, which uses the credentials from the S3A fs.   We at least need
to document this behavior.
> - Document any other limitations etc. in the s3guard.md site doc.
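
The cross-bucket collision described in the issue can be sketched as follows; {{pathOnlyKey}} and {{bucketQualifiedKey}} are hypothetical names for illustration, not the actual DynamoDBMetadataStore key logic:

```java
import java.net.URI;

public class KeyDemo {
  // Keying items by path alone: two different buckets collide on the same key.
  static String pathOnlyKey(URI uri) {
    return uri.getPath();
  }

  // Including the bucket (the URI host) in the key keeps the entries distinct,
  // which is what a shared per-cluster table needs.
  static String bucketQualifiedKey(URI uri) {
    return "/" + uri.getHost() + uri.getPath();
  }

  public static void main(String[] args) {
    URI a = URI.create("s3a://bucket-a/path1");
    URI b = URI.create("s3a://bucket-b/path1");
    System.out.println(pathOnlyKey(a).equals(pathOnlyKey(b)));             // true: collision
    System.out.println(bucketQualifiedKey(a).equals(bucketQualifiedKey(b))); // false: distinct
  }
}
```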

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
