hadoop-common-issues mailing list archives

From "Sean Mackrory (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13826) [s3a] Deadlock possible using Amazon S3 SDK
Date Tue, 22 Nov 2016 00:07:58 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HADOOP-13826:
-----------------------------------
    Attachment: HADOOP-13826.002.patch

Attaching a proof-of-concept of my proposed solution. It still needs some polish and has the
major drawback of depending on classes in com.amazonaws.services.s3.transfer.internal. It
also has the major drawback of not working. It can handle more concurrent renames, but
it appears there isn't a simple division between 'control tasks' and 'subtasks'. I had
the control task pool fill up while the subtask pool was still empty, and it deadlocked:
tasks that count as control tasks can themselves spawn other control tasks. I don't think
tasks ever spawn other tasks of the same type, so I'm going to try adding another tier for
those other control tasks.
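
To illustrate the tiering idea, here is a minimal sketch in which every tier gets its own pool
and a task only ever blocks on a task in the tier below it. The class TieredPools and its
methods are hypothetical names for illustration; this is not the code in the attached patch
and it does not use the AWS SDK.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not the code in HADOOP-13826.002.patch.
public class TieredPools {
    private final ExecutorService[] tiers;

    TieredPools(int tierCount, int threadsPerTier) {
        tiers = new ExecutorService[tierCount];
        for (int i = 0; i < tierCount; i++) {
            tiers[i] = Executors.newFixedThreadPool(threadsPerTier);
        }
    }

    /** Runs a task in the given tier; it blocks on one child task in the next tier. */
    Future<Integer> submit(int tier) {
        return tiers[tier].submit(() -> {
            if (tier == tiers.length - 1) {
                return 1; // lowest tier: a leaf subtask that spawns nothing
            }
            // A task waits only on the tier below, so no pool ever waits on itself.
            return 1 + submit(tier + 1).get();
        });
    }

    void shutdown() {
        for (ExecutorService tier : tiers) {
            tier.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        TieredPools pools = new TieredPools(3, 1);
        try {
            // Counts the tiers the chain passed through.
            System.out.println(pools.submit(0).get(2, TimeUnit.SECONDS));
        } finally {
            pools.shutdown();
        }
    }
}
```

Even with a single thread per tier the chain completes, because the thread a task is waiting
for always lives in a different pool. This only holds under the assumption stated above, that
a task never spawns and waits on a task of its own tier.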

There's also the question, once all the other obstacles are out of the way, of how this gets
configured. It's no longer a single pool of resources, yet it's still configured as one. Maybe
we adopt a rule of thumb that 20% of the threads are for control tasks and the rest are for
subtasks, or something along those lines?
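
As a rough sketch of that rule of thumb, a single configured thread count could be divided
like this. PoolSplit, controlThreads and subtaskThreads are hypothetical names, and the 20%
figure is only the suggestion floated above, not an existing setting.

```java
// Hypothetical helper: splits one thread budget between control tasks and subtasks.
public class PoolSplit {
    /** Threads reserved for control tasks: the given percentage, but at least 1. */
    static int controlThreads(int totalThreads, int controlPercent) {
        if (totalThreads < 2) {
            throw new IllegalArgumentException("need at least 2 threads to split");
        }
        int control = totalThreads * controlPercent / 100; // integer division rounds down
        // Always leave at least one thread on each side of the split.
        return Math.max(1, Math.min(control, totalThreads - 1));
    }

    /** Whatever is left over goes to subtasks. */
    static int subtaskThreads(int totalThreads, int controlPercent) {
        return totalThreads - controlThreads(totalThreads, controlPercent);
    }

    public static void main(String[] args) {
        int total = 10;
        System.out.println("control=" + controlThreads(total, 20)
                + " subtask=" + subtaskThreads(total, 20));
    }
}
```

Clamping both sides to at least one thread keeps a small budget from leaving either pool
empty, which would recreate the starvation problem.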

> [s3a] Deadlock possible using Amazon S3 SDK
> -------------------------------------------
>
>                 Key: HADOOP-13826
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13826
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Sean Mackrory
>         Attachments: HADOOP-13826.001.patch, HADOOP-13826.002.patch
>
>
> In testing HIVE-15093 we have encountered deadlocks in the s3a connector. The TransferManager
> javadocs (http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html)
> explain how this is possible:
> {quote}It is not recommended to use a single threaded executor or a thread pool with
> a bounded work queue as control tasks may submit subtasks that can't complete until all sub
> tasks complete. Using an incorrectly configured thread pool may cause a deadlock (I.E. the
> work queue is filled with control tasks that can't finish until subtasks complete but subtasks
> can't execute because the queue is filled).{quote}
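
The single-threaded case from that javadoc can be reproduced without the SDK. The sketch below
uses only java.util.concurrent; StarvationDemo and runControlTask are hypothetical names, and
the "control task" here is a stand-in, not TransferManager's actual task.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for the control task / subtask pattern; no AWS SDK involved.
public class StarvationDemo {
    /** Returns true if the control task finished within the timeout. */
    static boolean runControlTask(ExecutorService pool, long timeoutMs) throws Exception {
        Future<String> control = pool.submit(() -> {
            // Control task: hands a subtask to the same pool, then blocks on it.
            Future<String> sub = pool.submit(() -> "subtask done");
            return sub.get();
        });
        try {
            control.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the subtask never got a thread, so the control task is stuck
        } finally {
            pool.shutdownNow(); // interrupt any stuck worker so the JVM can exit
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1 thread:  "
                + runControlTask(Executors.newFixedThreadPool(1), 500));
        System.out.println("2 threads: "
                + runControlTask(Executors.newFixedThreadPool(2), 500));
    }
}
```

With one thread the control task holds the only worker while its subtask waits in the queue,
so neither can finish. With two threads this particular example completes, but enough
concurrent control tasks would exhaust any fixed-size pool in the same way.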



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

