hadoop-common-issues mailing list archives

From "Sean Mackrory (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13826) [s3a] Deadlock possible using Amazon S3 SDK
Date Mon, 21 Nov 2016 20:21:58 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HADOOP-13826:
    Attachment: HADOOP-13826.001.patch

Just attaching an integration test that reproduces the problem. It creates two 64 MB files
and tries to rename them both at once using multi-part uploads; the renames never complete.

One can work around this problem by increasing the number of threads (Amazon recommends an
unbounded executor). A better approach may be to implement the executor with 2 separate
thread pools and task queues: one for "control tasks" and one for "sub tasks". This could
make configuring maximum threads, etc. a little tricky, but short of removing all limits,
it would at least allow real work to progress even when the pool / queue for control tasks
is saturated. Unfortunately it looks like we'd have to rely on classes internal to the S3
SDK to do this, so I'm going to open an issue with them to see if they can provide a more
explicitly supported way to achieve this.
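The two-pool idea above can be sketched in plain {{java.util.concurrent}} terms. This is only an illustration, not the SDK's actual executor: the class and method names ({{TwoPoolSketch}}, {{rename}}) are hypothetical, and the real TransferManager internals differ. The point is that a control task blocking on a sub task is safe when the sub task is guaranteed a different pool:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TwoPoolSketch {
    // Hypothetical split: control tasks (e.g. a rename/copy) go to one
    // bounded pool, sub tasks (e.g. part uploads) to another, so a full
    // control pool can never starve the sub tasks it is waiting on.
    static final ExecutorService CONTROL = Executors.newFixedThreadPool(2);
    static final ExecutorService SUBTASK = Executors.newFixedThreadPool(8);

    static String rename(String src, String dst) throws Exception {
        Future<String> control = CONTROL.submit(() -> {
            // The control task fans out its work to the sub-task pool.
            Future<?> part = SUBTASK.submit(() -> {
                // stand-in for uploading one multi-part chunk
            });
            // Safe to block here: sub tasks never compete with control
            // tasks for threads, so this future can always make progress.
            part.get();
            return src + " -> " + dst;
        });
        return control.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rename("src-key", "dst-key"));
        CONTROL.shutdown();
        SUBTASK.shutdown();
    }
}
```

Sizing the two pools independently is what makes configuration "a little tricky", but both remain bounded.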

> [s3a] Deadlock possible using Amazon S3 SDK
> -------------------------------------------
>                 Key: HADOOP-13826
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13826
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Sean Mackrory
>         Attachments: HADOOP-13826.001.patch
> In testing HIVE-15093 we have encountered deadlocks in the s3a connector. The TransferManager
> javadocs (http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html)
> explain how this is possible:
> {quote}It is not recommended to use a single threaded executor or a thread pool with
> a bounded work queue as control tasks may submit subtasks that can't complete until all sub
> tasks complete. Using an incorrectly configured thread pool may cause a deadlock (I.E. the
> work queue is filled with control tasks that can't finish until subtasks complete but subtasks
> can't execute because the queue is filled).{quote}
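The quoted failure mode can be reproduced with a few lines of {{java.util.concurrent}} code, independent of the S3 SDK. This is a minimal sketch, not the s3a reproduction in the attached patch: with a single-threaded pool, the "control" task occupies the only thread while blocking on a "sub" task that can never be scheduled. The timeout below is just how the demo detects the hang:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class DeadlockDemo {
    // Returns true if the control task's sub task completed, false if
    // the pool deadlocked (detected via a timeout rather than hanging).
    static boolean subTaskCompleted(ExecutorService pool) throws Exception {
        Future<?> control = pool.submit(() -> {
            // Control task submits a sub task to the SAME pool...
            Future<?> sub = pool.submit(() -> {
                // stand-in for real work, e.g. a part upload
            });
            try {
                // ...then blocks waiting for it. With one thread, the
                // sub task sits in the queue forever: classic deadlock.
                sub.get();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        try {
            control.get(2, TimeUnit.SECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // deadlocked
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        System.out.println("completed=" + subTaskCompleted(pool));
        pool.shutdownNow();
    }
}
```

The same hang occurs with any bounded pool once enough control tasks are in flight to fill every thread, which is why two concurrent 64 MB multi-part renames are sufficient to trigger it in s3a.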

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
