hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aaron Fabbri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.
Date Wed, 30 Nov 2016 02:01:02 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707232#comment-15707232

Aaron Fabbri commented on HADOOP-13449:

Congrats on the clean jenkins run [~liuml07].

I'm running some tests with this configured as the MetadataStore today.  First thing I'm noticing
is a failure in {{TestS3AGetFileStatus}}.  This led me to this part of the v10 patch:

+public class MockS3ClientFactory extends Configured implements S3ClientFactory {
+  @Override
+  public AmazonDynamoDBClient createDynamoDBClient(
+      URI uri, com.amazonaws.regions.Region region) throws IOException {
+    final DefaultS3ClientFactory factory = new DefaultS3ClientFactory();
+    factory.setConf(getConf());
+    return factory.createDynamoDBClient(uri, region);
+  }

I believe the goal of the mock s3 client is to be able to run non-integration (unit) tests
without S3 configured.  It looks like you are creating an actual S3 client from the mock client.
 Doesn't this break the ability of unit tests to run without S3?

It seems like all the unit tests should only use MetadataStores which can run locally (Null
or LocalMetadataStore).  So, maybe we do not need this code at all.  Maybe MockS3ClientFactory#createDynamoDBClient()
just throws a runtime exception "Failing to create DynamoDB for unit test", and then we fall
back to the NullMetadataStore automatically in S3Guard#getMetadataStore()?

I'm also wondering if, instead of having S3ClientFactory expose a createDynamoDBClient() method,
we should just add getters to S3ClientFactory (getAwsConfig() and maybe getCredentials()),
and then move createDynamoDBClient() to inside the DynamoDBMetadataStore? The DynamoDBMetadataStore
can then call the getters on the client to get what it needs to construct a DynamoDB client.
 The goal here would be to keep dynamodb specifics encapsulated in DynamoDBMetadataStore.
 This would allow, for example, removing the Dynamo dependency from s3a if we ever want to
create a separate submodule for the DynamoDBMetadataStore.

> S3Guard: Implement DynamoDBMetadataStore.
> -----------------------------------------
>                 Key: HADOOP-13449
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13449
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Chris Nauroth
>            Assignee: Mingliang Liu
>         Attachments: HADOOP-13449-HADOOP-13345.000.patch, HADOOP-13449-HADOOP-13345.001.patch,
HADOOP-13449-HADOOP-13345.002.patch, HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch,
HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, HADOOP-13449-HADOOP-13345.007.patch,
HADOOP-13449-HADOOP-13345.008.patch, HADOOP-13449-HADOOP-13345.009.patch, HADOOP-13449-HADOOP-13345.010.patch
> Provide an implementation of the metadata store backed by DynamoDB.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

View raw message