hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13203) S3a: Consider reducing the number of connection aborts by setting correct length in s3 request
Date Mon, 20 Jun 2016 19:18:58 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Steve Loughran updated HADOOP-13203:
------------------------------------
    Attachment: HADOOP-13203-branch-2-006.patch

HADOOP-13203 Patch 006. Explicit policies based on fadvise terminology(); configuration file
option uses "experimental" in its name, to indicate that it is. Explicit test of random IO
shows 4x speedup over sequential IO. Also: cut back on some of the scale tests that were just
doing sequential seek+read with different readahead sizes. They don't add much, just take
up another 10-20s

> S3a: Consider reducing the number of connection aborts by setting correct length in s3
request
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13203
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13203
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: HADOOP-13203-branch-2-001.patch, HADOOP-13203-branch-2-002.patch,
HADOOP-13203-branch-2-003.patch, HADOOP-13203-branch-2-004.patch, HADOOP-13203-branch-2-005.patch,
HADOOP-13203-branch-2-006.patch, stream_stats.tar.gz
>
>
> Currently file's "contentLength" is set as the "requestedStreamLen", when invoking S3AInputStream::reopen().
 As a part of lazySeek(), sometimes the stream had to be closed and reopened. But lots of
times the stream was closed with abort() causing the internal http connection to be unusable.
This incurs lots of connection establishment cost in some jobs.  It would be good to set the
correct value for the stream length to avoid connection aborts. 
> I will post the patch once aws tests passes in my machine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


Mime
View raw message