hadoop-yarn-issues mailing list archives

From "zhihai xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2594) Potential deadlock in RM when querying ApplicationResourceUsageReport
Date Fri, 26 Sep 2014 03:06:34 GMT

    [ https://issues.apache.org/jira/browse/YARN-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148665#comment-14148665 ]

zhihai xu commented on YARN-2594:
---------------------------------

The [ReentrantReadWriteLock | http://tutorials.jenkov.com/java-util-concurrent/readwritelock.html]
implementation behaves as follows:
{code}
Read Lock:   If no threads have locked the ReadWriteLock for writing,
             and no thread has requested a write lock (but not yet obtained it).
             Thus, multiple threads can lock the lock for reading.
Write Lock:  If no threads are reading or writing.
             Thus, only one thread at a time can lock the lock for writing.
{code}
Based on the above, the first three threads can cause a deadlock:
the readLock is first acquired by thread#1, then thread#3 blocks waiting for the writeLock, and finally,
when thread#2 tries to acquire the readLock, it is also blocked because thread#3 requested
the writeLock before thread#2.
So this is not a bug in Java.
The following is the source code in ReentrantReadWriteLock.java:
{code}
    static final class NonfairSync extends Sync {
        private static final long serialVersionUID = -8159625535654395037L;
        final boolean writerShouldBlock() {
            return false; // writers can always barge
        }
        final boolean readerShouldBlock() {
            /* As a heuristic to avoid indefinite writer starvation,
             * block if the thread that momentarily appears to be head
             * of queue, if one exists, is a waiting writer.  This is
             * only a probabilistic effect since a new reader will not
             * block if there is a waiting writer behind other enabled
             * readers that have not yet drained from the queue.
             */
            return apparentlyFirstQueuedIsExclusive();
        }
    }
{code}
readerShouldBlock checks whether the thread at the head of the wait queue is a writer waiting for the writeLock; if so, a new reader blocks behind it.
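
The behaviour described above can be reproduced in isolation. The following is only a minimal sketch, not code from the ResourceManager: the class and method names are hypothetical, and it assumes the default (non-fair) ReentrantReadWriteLock. The main thread plays thread#1 and holds the readLock, a second thread plays thread#3 and queues for the writeLock, and a third thread plays thread#2 and tries the readLock with a timeout so that the demo terminates instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; not part of Hadoop or the JDK.
final class ReaderBlockedByQueuedWriter {

    /** Returns true if a second reader got the read lock while a writer was queued. */
    static boolean secondReaderAcquires() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(); // non-fair by default
        lock.readLock().lock(); // "thread#1": this thread holds the read lock

        // "thread#3": queues for the write lock behind the active reader.
        Thread writer = new Thread(() -> {
            try {
                lock.writeLock().lockInterruptibly();
                lock.writeLock().unlock();
            } catch (InterruptedException e) {
                // Interrupted during cleanup below; nothing to release.
            }
        });

        // "thread#2": a new reader. The timed tryLock honours readerShouldBlock(),
        // unlike the untimed tryLock(), which always barges.
        boolean[] acquired = new boolean[1];
        Thread reader = new Thread(() -> {
            try {
                acquired[0] = lock.readLock().tryLock(200, TimeUnit.MILLISECONDS);
                if (acquired[0]) {
                    lock.readLock().unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try {
            writer.start();
            while (!lock.hasQueuedThread(writer)) {
                Thread.sleep(10); // wait until the writer is really in the wait queue
            }
            reader.start();
            reader.join();
            writer.interrupt(); // let the queued writer give up
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo interrupted", e);
        } finally {
            lock.readLock().unlock();
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        // Expected on a stock JDK: "second reader acquired: false"
        System.out.println("second reader acquired: " + secondReaderAcquires());
    }
}
```

If thread#1 were itself waiting on something held by thread#2, none of the three threads could make progress, which matches the deadlock pattern described in this comment.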

> Potential deadlock in RM when querying ApplicationResourceUsageReport
> ---------------------------------------------------------------------
>
>                 Key: YARN-2594
>                 URL: https://issues.apache.org/jira/browse/YARN-2594
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Karam Singh
>            Assignee: Wangda Tan
>            Priority: Blocker
>         Attachments: YARN-2594.patch
>
>
> ResourceManager sometimes becomes unresponsive:
> There was no exception in the ResourceManager log, which contains only the following type of messages:
> {code}
> 2014-09-19 19:13:45,241 INFO  event.AsyncDispatcher (AsyncDispatcher.java:handle(232)) - Size of event-queue is 53000
> 2014-09-19 19:30:26,312 INFO  event.AsyncDispatcher (AsyncDispatcher.java:handle(232)) - Size of event-queue is 54000
> 2014-09-19 19:47:07,351 INFO  event.AsyncDispatcher (AsyncDispatcher.java:handle(232)) - Size of event-queue is 55000
> 2014-09-19 20:03:48,460 INFO  event.AsyncDispatcher (AsyncDispatcher.java:handle(232)) - Size of event-queue is 56000
> 2014-09-19 20:20:29,542 INFO  event.AsyncDispatcher (AsyncDispatcher.java:handle(232)) - Size of event-queue is 57000
> 2014-09-19 20:37:10,635 INFO  event.AsyncDispatcher (AsyncDispatcher.java:handle(232)) - Size of event-queue is 58000
> 2014-09-19 20:53:51,722 INFO  event.AsyncDispatcher (AsyncDispatcher.java:handle(232)) - Size of event-queue is 59000
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
