hadoop-common-issues mailing list archives

From "Lei (Eddy) Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13650) S3Guard: Provide command line tools to manipulate metadata store.
Date Fri, 06 Jan 2017 11:41:58 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15804365#comment-15804365 ]

Lei (Eddy) Xu commented on HADOOP-13650:

bq. DDB table region is always the s3 bucket region for simplicity.

Yes, I agree with this.

bq. The general usage pattern is to specify the fs.defaultFS as s3://mybucket alike:

I was thinking of use cases such as using Hadoop to run ETL jobs that take S3 as the input
and output locations, as AWS EMR does. In such a case, the computing cluster (i.e., Hadoop
/ Hive / Spark) should set {{fs.defaultFS}} to the NameNode, because the ETL pipelines use
this HDFS cluster instead of S3 to store intermediate data.
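
As a rough illustration of this scenario (not the actual S3Guard code), the sketch below models a configuration whose {{fs.defaultFS}} points at HDFS. The helper {{resolve_s3a_bucket}}, the NameNode address, and the bucket name are all hypothetical; the only point is that a tool which falls back to the default filesystem cannot find a bucket there, so the s3a URI has to be passed explicitly.

```python
from urllib.parse import urlparse

# Hypothetical configuration for an ETL cluster: intermediate data lives on
# HDFS, so fs.defaultFS points at the NameNode rather than at an S3 bucket.
conf = {"fs.defaultFS": "hdfs://namenode:8020"}


def resolve_s3a_bucket(conf, explicit_uri=None):
    """Return the bucket from an explicit s3a URI, else from fs.defaultFS.

    Illustrative helper only; it is not part of Hadoop or S3Guard.
    """
    uri = explicit_uri or conf.get("fs.defaultFS", "")
    parsed = urlparse(uri)
    # Only an s3a:// URI with a non-empty authority names a bucket.
    if parsed.scheme != "s3a" or not parsed.netloc:
        raise ValueError(f"cannot determine S3 bucket from {uri!r}")
    return parsed.netloc
```

With this configuration, {{resolve_s3a_bucket(conf)}} raises {{ValueError}}, while {{resolve_s3a_bucket(conf, "s3a://mybucket/input")}} succeeds because the bucket is named explicitly.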

In the current {{DynamoDBMetadataStore#initialize(Configuration)}}, such a case will raise an
{{Exception}} if we do not explicitly specify the s3a URI in the CLI, because it cannot create

bq. we rely on the endpoint for determining the DDB region.

I am fine with that. 

> S3Guard: Provide command line tools to manipulate metadata store.
> -----------------------------------------------------------------
>                 Key: HADOOP-13650
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13650
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>         Attachments: HADOOP-13650-HADOOP-13345.000.patch, HADOOP-13650-HADOOP-13345.001.patch,
>                      HADOOP-13650-HADOOP-13345.002.patch, HADOOP-13650-HADOOP-13345.003.patch
> Similar systems like EMRFS have CLI tools to manipulate the metadata store, for example
> to create or delete the metadata store, or to {{import}} or {{sync}} file metadata between
> the metadata store and S3.
> http://docs.aws.amazon.com//ElasticMapReduce/latest/ReleaseGuide/emrfs-cli-reference.html
> S3Guard should offer similar functionality. 

This message was sent by Atlassian JIRA

