hadoop-hdfs-issues mailing list archives

From "David S. Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6383) Upgrade S3n s3.fs.buffer.dir to support multi directories
Date Wed, 14 May 2014 02:42:14 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997211#comment-13997211 ]

David S. Wang commented on HDFS-6383:

Thanks Ted for the patch.

We should probably use LocalDirAllocator, as s3a does.  That seems to be the proper
way to do this in Hadoop.
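
For illustration only, here is a minimal self-contained Java sketch of the general idea
behind spreading buffer files across several directories: rotate over a comma-separated
list and skip any directory that lacks the requested free space. This is not Hadoop's
LocalDirAllocator and not the attached patch; the class name BufferDirPicker and its
method createTmpFile are hypothetical names chosen for this example.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class BufferDirPicker {
    private final List<File> dirs = new ArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    /** Accepts a comma-separated list of directories, e.g. "/data1/tmp,/data2/tmp". */
    BufferDirPicker(String commaSeparatedDirs) {
        for (String d : commaSeparatedDirs.split(",")) {
            String trimmed = d.trim();
            if (!trimmed.isEmpty()) {
                dirs.add(new File(trimmed));
            }
        }
        if (dirs.isEmpty()) {
            throw new IllegalArgumentException("no buffer directories configured");
        }
    }

    /**
     * Creates a temp file in the next usable directory, starting from a
     * rotating index so that successive writes land on different drives.
     * The prefix must be at least three characters (File.createTempFile rule).
     */
    File createTmpFile(String prefix, long requiredBytes) throws IOException {
        int start = Math.floorMod(next.getAndIncrement(), dirs.size());
        for (int i = 0; i < dirs.size(); i++) {
            File dir = dirs.get((start + i) % dirs.size());
            boolean exists = dir.isDirectory() || dir.mkdirs();
            // Skip directories that are missing, read-only, or too full.
            if (exists && dir.canWrite() && dir.getUsableSpace() >= requiredBytes) {
                return File.createTempFile(prefix, ".tmp", dir);
            }
        }
        throw new IOException("no buffer directory has " + requiredBytes + " bytes free");
    }
}
```

Rotating the starting index addresses the single-drive IO limit, and the free-space check
addresses the capacity limit, which are the two issues the description below raises.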

> Upgrade S3n s3.fs.buffer.dir to support multi directories
> ---------------------------------------------------------
>                 Key: HDFS-6383
>                 URL: https://issues.apache.org/jira/browse/HDFS-6383
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.4.0
>            Reporter: Ted Malaska
>            Assignee: Ted Malaska
>            Priority: Minor
>         Attachments: HDFS-6383.patch
> s3.fs.buffer.dir defines the tmp folder where files are written before being
> sent to S3.  Right now this is limited to a single folder, which causes two major issues:
> 1. You need a drive with enough space to store all the tmp files at once.
> 2. You are limited to the IO speed of a single drive.
> This solution resolves both and has been tested to increase the S3 write speed by
> 2.5x with 10 mappers on hs1.

This message was sent by Atlassian JIRA