hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-12319) S3AFastOutputStream has no ability to apply backpressure
Date Wed, 17 Aug 2016 19:32:21 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-12319:
    Fix Version/s: 2.8.0

> S3AFastOutputStream has no ability to apply backpressure
> --------------------------------------------------------
>                 Key: HADOOP-12319
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12319
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 2.7.0
>            Reporter: Colin Marc
>            Priority: Critical
>             Fix For: 2.8.0
> Currently, users of S3AFastOutputStream can control memory usage with a few settings:
> {{fs.s3a.threads.core}} and {{fs.s3a.threads.max}}, which control the number of active
> uploads (specifically, as arguments to a {{ThreadPoolExecutor}}), and
> {{fs.s3a.max.total.tasks}}, which controls the size of the feeding queue for the
> {{ThreadPoolExecutor}}.
> However, a user can get an almost *guaranteed* crash if the throughput of the writing
> job is higher than the total S3 throughput, because there is never any backpressure or
> blocking on calls to {{write}}.
> If {{fs.s3a.max.total.tasks}} is set high (the default is 1000), then {{write}} calls
> will continue to add data to the queue, which can eventually exhaust the heap (OOM). But
> if the user tries to set it lower, then writes will fail when the queue is full; the
> {{ThreadPoolExecutor}} will reject the part with
> {{java.util.concurrent.RejectedExecutionException}}.
> Ideally, calls to {{write}} should *block, not fail* when the queue is full, so as to
> apply backpressure to the writing process.
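
The two behaviours described above can be illustrated with a small standalone sketch. This is not the S3A code and not necessarily how the issue was fixed: the class BackpressureSketch, the BlockingSubmitter wrapper, and the pool sizes (one thread, a queue of two) are all hypothetical, chosen only to make the effect visible. It assumes a Semaphore is an acceptable way to make the submitting thread wait; the queue capacity is kept at least as large as the permit count so that a permitted submission is never rejected.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BackpressureSketch {

    // Illustrative pool: one "upload" thread feeding from a bounded queue.
    static ThreadPoolExecutor newPool(int queueCapacity) {
        return new ThreadPoolExecutor(1, 1, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
    }

    static void shutdownAndWait(ThreadPoolExecutor pool) {
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Behaviour described in the issue: a full queue rejects new parts. */
    static int countRejections(int parts) {
        ThreadPoolExecutor pool = newPool(2);
        CountDownLatch stalledUpload = new CountDownLatch(1); // stands in for slow S3
        int rejected = 0;
        for (int i = 0; i < parts; i++) {
            try {
                pool.execute(() -> {
                    try {
                        stalledUpload.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // thread busy and queue full: the part is refused
            }
        }
        stalledUpload.countDown();
        shutdownAndWait(pool);
        return rejected;
    }

    /** Hypothetical wrapper: submit() blocks the caller instead of failing. */
    static final class BlockingSubmitter {
        private final Executor delegate;
        private final Semaphore permits;

        BlockingSubmitter(Executor delegate, int maxOutstanding) {
            this.delegate = delegate;
            this.permits = new Semaphore(maxOutstanding);
        }

        void submit(Runnable task) {
            try {
                permits.acquire(); // the writer waits here: this is the backpressure
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
            try {
                delegate.execute(() -> {
                    try {
                        task.run();
                    } finally {
                        permits.release(); // free a slot once the part is done
                    }
                });
            } catch (RejectedExecutionException e) {
                permits.release();
                throw e;
            }
        }
    }

    /** Same pool, but submissions go through the blocking wrapper. */
    static int countCompletedWithBackpressure(int parts) {
        ThreadPoolExecutor pool = newPool(2);
        BlockingSubmitter submitter = new BlockingSubmitter(pool, 2);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < parts; i++) {
            submitter.submit(() -> {
                try {
                    Thread.sleep(5); // pretend to upload a part
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                completed.incrementAndGet();
            });
        }
        shutdownAndWait(pool);
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("rejected without backpressure: " + countRejections(6));
        System.out.println("completed with backpressure: " + countCompletedWithBackpressure(6));
    }
}
```

With six parts, the plain pool accepts three (one running, two queued) and rejects the other three, while the wrapped pool completes all six because the submitting thread is held back until a slot frees up.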
