hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10610) Upgrade S3n s3.fs.buffer.dir to support multi directories
Date Fri, 16 May 2014 11:22:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998569#comment-13998569 ]

Steve Loughran commented on HADOOP-10610:

The overall concept makes sense.

# {{LocalDirAllocator}} contains the logic to worry about file capacity, writability, etc.
The more disks you list, the more likely it is that one of them has failed, and the more you
need that code. This patch currently just bails out if one destination directory isn't there,
even if others are present.
# This would be an ideal time to move {{"fs.s3.buffer.dir"}} from an inline string to a constant,
where it can be referred to in Hadoop and external code.
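To illustrate both points, here is a minimal plain-Java sketch. It is not the patch and not how {{LocalDirAllocator}} is implemented; the class {{BufferDirs}}, its method names, and the comma-separated option format are assumptions made for illustration only. It keeps the key in a single constant and skips directories that are missing, unwritable, or short of space, failing only when none is usable.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class BufferDirs {
    /** Configuration key held in one constant instead of repeated inline strings. */
    static final String BUFFER_DIR_KEY = "fs.s3.buffer.dir";

    /**
     * Returns the listed directories that exist (or can be created), are
     * writable, and report at least minFreeBytes of usable space.
     * Unusable entries are skipped rather than treated as fatal.
     */
    static List<File> usableDirs(String commaSeparated, long minFreeBytes) {
        List<File> usable = new ArrayList<>();
        for (String entry : commaSeparated.split(",")) {
            String name = entry.trim();
            if (name.isEmpty()) {
                continue; // tolerate stray or trailing commas
            }
            File dir = new File(name);
            boolean present = dir.isDirectory() || dir.mkdirs();
            if (present && dir.canWrite() && dir.getUsableSpace() >= minFreeBytes) {
                usable.add(dir);
            }
        }
        return usable;
    }

    /** Creates a buffer file in the first usable directory; fails only if none is usable. */
    static File createBufferFile(String commaSeparated, long minFreeBytes) throws IOException {
        List<File> usable = usableDirs(commaSeparated, minFreeBytes);
        if (usable.isEmpty()) {
            throw new IOException("No usable directory in " + BUFFER_DIR_KEY + ": " + commaSeparated);
        }
        return File.createTempFile("output-", ".tmp", usable.get(0));
    }
}
```

In Hadoop itself the selection would presumably be delegated to {{LocalDirAllocator}} rather than reimplemented like this; the sketch only shows the "skip the bad disk, keep going" behaviour the comment asks for.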

Recommended tests:
# a test with > 1 directory in the args
# basic handling of "erroneous" inputs, e.g. trailing commas in options
# bad directories in input paths
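The three cases above could be exercised along these lines. This is a sketch in plain Java with no test framework; {{BufferDirOption.parse}} is a hypothetical stand-in for whatever parsing the patch performs, and the comma-separated format is assumed.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical parser used only to illustrate the test cases; not Hadoop code. */
final class BufferDirOption {
    /** Splits a comma-separated value, trimming whitespace and dropping empty entries. */
    static List<String> parse(String value) {
        List<String> dirs = new ArrayList<>();
        if (value == null) {
            return dirs;
        }
        for (String entry : value.split(",", -1)) {
            String dir = entry.trim();
            if (!dir.isEmpty()) {
                dirs.add(dir);
            }
        }
        return dirs;
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        // 1. More than one directory in the option value.
        check(parse("/d1/tmp,/d2/tmp").size() == 2, "two directories expected");

        // 2. "Erroneous" input: doubled, blank, and trailing commas.
        check(parse("/d1/tmp,,/d2/tmp, ,").equals(Arrays.asList("/d1/tmp", "/d2/tmp")),
              "stray commas should be ignored");
        check(parse("").isEmpty(), "empty value should give no directories");

        // 3. Bad directory: it parses, but must be detected as unusable.
        String missing = "/no/such/dir-" + System.nanoTime();
        check(!new File(parse(missing).get(0)).isDirectory(), "missing directory should be detected");

        System.out.println("all checks passed");
    }
}
```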

> Upgrade S3n s3.fs.buffer.dir to support multi directories
> ---------------------------------------------------------
>                 Key: HADOOP-10610
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10610
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 2.4.0
>            Reporter: Ted Malaska
>            Assignee: Ted Malaska
>            Priority: Minor
>         Attachments: HDFS-6383.patch
> s3.fs.buffer.dir defines the tmp folder where files are written before being sent to S3.
> Right now this is limited to a single folder, which causes two major issues:
> 1. You need a drive with enough space to store all the tmp files at once.
> 2. You are limited to the IO speed of a single drive.
> This solution resolves both and has been tested to increase the S3 write speed by 2.5x
> with 10 mappers on hs1.

This message was sent by Atlassian JIRA
