hadoop-common-issues mailing list archives

From "Mingliang Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14090) Allow users to specify region for DynamoDB table instead of endpoint
Date Tue, 28 Feb 2017 05:30:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15887305#comment-15887305 ]

Mingliang Liu commented on HADOOP-14090:
----------------------------------------

Sorry for chiming in late.

I'm thinking about this again (I read the whole thread and got your points) because I'm refactoring
the {{DynamoDBClientFactory}} by replacing {{new AmazonDynamoDBClient}} (a deprecated usage pattern
according to the AWS docs) with {{AmazonDynamoDBClientBuilder}}. The {{AmazonDynamoDBClientBuilder}}
can be configured with either a region or an endpoint configuration (the two are mutually exclusive),
and the endpoint configuration requires both a region and an endpoint. From this I think the
recommendation is to prefer region over endpoint configuration. Meanwhile, if we use the endpoint
configuration, we will have to provide the associated region as well, which we currently don't. We
don't have this problem in our existing code because, when an endpoint is configured, the region
is inferred from it in {{new AmazonDynamoDBClient}}.
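
To make the two builder modes concrete, here is a minimal sketch using the AWS SDK for Java 1.x. The class name {{DdbClientSketch}} and both method names are hypothetical and are not taken from any patch; credentials are assumed to come from the SDK's default provider chain.

```java
import com.amazonaws.client.builder.AwsClientBuilder.EndpointConfiguration;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;

// Hypothetical illustration class; not part of hadoop-aws.
final class DdbClientSketch {

    // Region-only configuration: the SDK derives the service endpoint
    // from the region name (for example "us-west-2").
    static AmazonDynamoDB forRegion(String region) {
        return AmazonDynamoDBClientBuilder.standard()
            .withRegion(region)
            .build();
    }

    // Endpoint configuration: the signing region has to be passed along
    // with the endpoint, so the caller must know both values.
    static AmazonDynamoDB forEndpoint(String endpoint, String signingRegion) {
        return AmazonDynamoDBClientBuilder.standard()
            .withEndpointConfiguration(
                new EndpointConfiguration(endpoint, signingRegion))
            .build();
    }
}
```

The second method is where the problem described above shows up: with only an endpoint in our config, there is no value to pass as the signing region.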

There are a few options here. I think it's better to make the region a config option, not the endpoint.
The endpoint would only be used for DynamoDBLocal in unit tests. We can have a new DynamoDBClientFactory
implementation that uses the endpoint; the endpoint config key is then no longer needed (having both
is indeed fragile and confusing). For the user, the DDB region will be (in order): 1) the DDB
region in the config, or 2) the S3 bucket location. If neither is provided, that indicates
an error in {{DefaultDynamoDBClientFactory}}.
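
The lookup order above could be sketched roughly as follows. This is plain Java with no Hadoop or AWS dependencies; {{DdbRegionResolver}} and {{resolveRegion}} are hypothetical names used only for illustration, and treating blank strings as "not provided" is an assumption.

```java
// Hypothetical helper illustrating the proposed lookup order; not actual S3Guard code.
final class DdbRegionResolver {

    // Returns the DynamoDB region to use:
    //   1) the region from the config, if set;
    //   2) otherwise the S3 bucket location;
    //   3) otherwise fails, since no region can be determined.
    static String resolveRegion(String configuredRegion, String bucketLocation) {
        if (configuredRegion != null && !configuredRegion.trim().isEmpty()) {
            return configuredRegion.trim();
        }
        if (bucketLocation != null && !bucketLocation.trim().isEmpty()) {
            return bucketLocation.trim();
        }
        throw new IllegalArgumentException(
            "No DynamoDB region configured and no S3 bucket location available");
    }
}
```

For example, {{resolveRegion(null, "eu-west-1")}} falls through to the bucket location, while an explicitly configured region always wins.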

I can imagine a few benefits:
# Region is simpler than endpoint for users
# {{AmazonDynamoDBClientBuilder}} usage is simpler (otherwise users would have to provide both
region and endpoint)
# [HADOOP-14023] will hopefully be simpler
# [HADOOP-14027] will hopefully be simpler

I also uploaded a patch in [HADOOP-14130] so if that makes sense, this JIRA will be partially
resolved as "fixed".

> Allow users to specify region for DynamoDB table instead of endpoint
> --------------------------------------------------------------------
>
>                 Key: HADOOP-14090
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14090
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>         Attachments: HADOOP-14090-HADOOP-13345.001.patch
>
>
> Assuming the AWS SDK allows this, I think this would be a better way to configure it
> for any usage on AWS itself (with endpoint still being an option for AWS-compatible non-AWS
> use cases). Unless users actually care about a specific endpoint, this is easier. Perhaps
> less important, HADOOP-14023 shows that inferring the region from the endpoint (which granted,
> isn't that necessary) doesn't work very well at all.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


