hadoop-common-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13655) document object store use with fs shell and distcp
Date Wed, 28 Sep 2016 04:08:20 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15528312#comment-15528312 ]

ASF GitHub Bot commented on HADOOP-13655:
-----------------------------------------

Github user yuanboliu commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/131#discussion_r80839511
  
    --- Diff: hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md ---
    @@ -729,3 +757,278 @@ usage
     Usage: `hadoop fs -usage command`
     
     Return the help for an individual command.
    +
    +
    +<a name="ObjectStores" />Working with Object Storage
    +====================================================
    +
    +The Hadoop FileSystem shell works with Object Stores such as Amazon S3, 
    +Azure WASB and OpenStack Swift.
    +
    +
    +
    +```bash
    +# Create a directory
    +hadoop fs -mkdir s3a://bucket/datasets/
    +
    +# Upload a file from the cluster filesystem
    +hadoop fs -put /datasets/example.orc s3a://bucket/datasets/
    +
    +# touch a file
    +hadoop fs -touchz wasb://yourcontainer@youraccount.blob.core.windows.net/touched
    +```
    +
    +Unlike a normal filesystem, renaming files and directories in an object store
    +usually takes time proportional to the size of the objects being manipulated.
    +As many of the filesystem shell operations
    +use renaming as the final stage in operations, skipping that stage
    +can avoid long delays.
    + 
    +In particular, the `put` and `copyFromLocal` commands should
    +both have the `-d` options set for a direct upload.
    +
    +
    +```bash
    +# Upload a file from the cluster filesystem
    +hadoop fs -put -d /datasets/example.orc s3a://bucket/datasets/
    +
    +# Upload a file from the local filesystem
    +hadoop fs -copyFromLocal -d -f ~/datasets/devices.orc s3a://bucket/datasets/
    +
    +# create a file from stdin
    +echo "hello" | hadoop fs -put -d -f - wasb://yourcontainer@youraccount.blob.core.windows.net/hello.txt
    --- End diff --
    
    `hadoop fs -put -d -f - wasb:` should be `hadoop fs -put -d -f wasb:`


> document object store use with fs shell and distcp
> --------------------------------------------------
>
>                 Key: HADOOP-13655
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13655
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: documentation, fs, fs/s3
>    Affects Versions: 2.7.3
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>
> There's no specific docs for working with object stores from the {{hadoop fs}} shell or in distcp; people either suffer from this (performance, billing), or learn through trial and error what to do.
> Add a section in both fs shell and distcp docs covering use with object stores.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

