hadoop-common-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13650) S3Guard: Provide command line tools to manipulate metadata store.
Date Fri, 30 Dec 2016 19:38:58 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15788216#comment-15788216 ]

Chris Nauroth commented on HADOOP-13650:

bq. Would you mind to give me advice how to generate {{libexec/tools/hadoop-distcp.sh}}?

Hello [~eddyxu].  Here is a general recipe that should work:
# Create file hadoop-aws/src/main/shellprofile.d/hadoop-aws.sh.  The code of this shell profile
would need to add hadoop-aws.jar and its dependencies to the classpath and also call something
like {{hadoop_add_subcommand "s3a" "S3A Utilities"}} and define {{function hadoop_subcommand_s3a}}.
# Update hadoop-assemblies/src/main/resources/assemblies/hadoop-tools.xml so that the assembly
copies the new file to libexec/shellprofile.d when building the distro.  You can probably
copy and adapt the XML stanza that already does it for hadoop-distcp.  This is sufficient
to land it in the distro as an optional shell profile, which individual deployments or users
would have to enable explicitly.
# To turn it into a "built-in" like DistCp, which is always on the classpath and in the command
set for every deployment, you would need an additional step.  Edit hadoop-tools/hadoop-aws/pom.xml
and add a copy-dependencies execution.  You can probably copy and adapt the XML stanza that
already does it for hadoop-distcp.  Look for "tools-builtin" in hadoop-tools/hadoop-distcp/pom.xml.
This will cause it to land in libexec/tools and make it a built-in.
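
As a rough illustration of step 1, the shell profile might look something like the sketch
below.  Several things here are assumptions and not taken from the recipe above: the class
name org.example.s3a.S3ATool is a placeholder, the jar location built from HADOOP_TOOLS_HOME
and HADOOP_TOOLS_LIB_JARS_DIR is a guess at where the hadoop-aws jars end up, and the two
stand-in functions at the top exist only so the sketch can be sourced outside a Hadoop install.

```bash
# Stand-ins (illustration only) so this sketch can be sourced without Hadoop.
# In a real deployment hadoop-functions.sh already defines both functions,
# and these two blocks should be dropped from the profile.
if ! declare -f hadoop_add_subcommand >/dev/null 2>&1; then
  function hadoop_add_subcommand { echo "subcommand: $1 ($2)"; }
fi
if ! declare -f hadoop_add_classpath >/dev/null 2>&1; then
  function hadoop_add_classpath { echo "classpath: $1"; }
fi

# Only register the subcommand if nothing else has defined it already.
if ! declare -f hadoop_subcommand_s3a >/dev/null 2>&1; then

  # Makes "s3a" show up in the usage output of the hadoop command.
  hadoop_add_subcommand "s3a" "S3A Utilities"

  function hadoop_subcommand_s3a
  {
    # Assumption: placeholder for whatever class implements the CLI.
    HADOOP_CLASSNAME=org.example.s3a.S3ATool
    # Assumption: hadoop-aws.jar and its dependencies live in the tools lib dir.
    hadoop_add_classpath "${HADOOP_TOOLS_HOME}/${HADOOP_TOOLS_LIB_JARS_DIR}/*"
  }

fi
```

The guard around the function definition mirrors the intent of an optional profile: a
deployment that already defines its own s3a subcommand would keep its version.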

However, I'm not sure we really want it to be a built-in.  Doing so would put extra jars onto
the classpath for all Hadoop deployments, even for users that won't use S3A at all.  That
would have the usual side effects of bloating the classpath and possibly causing dependency
management challenges for end users that weren't expecting to receive those jars.

> S3Guard: Provide command line tools to manipulate metadata store.
> -----------------------------------------------------------------
>                 Key: HADOOP-13650
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13650
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>         Attachments: HADOOP-13650-HADOOP-13345.000.patch, HADOOP-13650-HADOOP-13345.001.patch,
> Similar systems like EMRFS have CLI tools to manipulate the metadata store, i.e.,
> create or delete the metadata store, or {{import}} and {{sync}} file metadata between the
> metadata store and S3.
> http://docs.aws.amazon.com//ElasticMapReduce/latest/ReleaseGuide/emrfs-cli-reference.html
> S3Guard should offer similar functionality. 
