Date: Thu, 20 Jul 2017 23:23:00 +0000 (UTC)
From: "Thomas Graves (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Commented] (SPARK-21460) Spark dynamic allocation breaks when ListenerBus event queue runs full

[ https://issues.apache.org/jira/browse/SPARK-21460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095546#comment-16095546 ]

Thomas Graves commented on SPARK-21460:
---------------------------------------

I didn't think that was the case, but I took a look at the code and I guess I was wrong: dynamic allocation definitely appears to rely on the listener bus. That is really bad in my opinion. We are intentionally dropping events, and we know that will cause issues.
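To illustrate the failure mode being discussed, here is a minimal, hypothetical sketch (not Spark's actual LiveListenerBus or ExecutorAllocationManager code): a bounded event queue silently drops an event once full, so a listener that counts running tasks from start/end events is left with a stale nonzero count, and the "no active work, release executors" condition is never reached.

```python
from collections import deque

# Hypothetical, simplified model of the failure mode; this is not
# Spark's actual LiveListenerBus or ExecutorAllocationManager code.

class BoundedEventQueue:
    """Bounded queue that silently drops events once full,
    like a full ListenerBus event queue."""
    def __init__(self, capacity):
        self.queue = deque()
        self.capacity = capacity
        self.dropped = 0

    def post(self, event):
        if len(self.queue) >= self.capacity:
            self.dropped += 1  # event is lost; listeners never see it
            return False
        self.queue.append(event)
        return True

class AllocationTracker:
    """Counts running tasks from start/end events; executors can only
    be released when the count returns to zero."""
    def __init__(self):
        self.running = 0

    def on_event(self, event):
        kind, _task_id = event
        if kind == "task_start":
            self.running += 1
        elif kind == "task_end":
            self.running -= 1

    def can_release_executors(self):
        return self.running == 0

bus = BoundedEventQueue(capacity=3)
tracker = AllocationTracker()

# Two tasks start and finish, but the queue fills up before the last
# task_end can be posted, so that event is dropped.
for event in [("task_start", 1), ("task_start", 2),
              ("task_end", 1), ("task_end", 2)]:
    bus.post(event)

# Drain the queue into the listener.
while bus.queue:
    tracker.on_event(bus.queue.popleft())

# One task_end was dropped, so the tracker still "sees" a running task,
# and dynamic allocation would never scale the idle executors down.
print(bus.dropped)                      # 1
print(tracker.can_release_executors())  # False
```

The names and queue capacity here are invented for illustration; the point is only that lossy event delivery turns the listener's state into an unrecoverable undercount or overcount.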
> Spark dynamic allocation breaks when ListenerBus event queue runs full
> ----------------------------------------------------------------------
>
>                 Key: SPARK-21460
>                 URL: https://issues.apache.org/jira/browse/SPARK-21460
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler, YARN
>    Affects Versions: 2.0.0, 2.0.2, 2.1.0, 2.1.1, 2.2.0
>         Environment: Spark 2.1
>                      Hadoop 2.6
>            Reporter: Ruslan Dautkhanov
>            Priority: Critical
>              Labels: dynamic_allocation, performance, scheduler, yarn
>
> When the ListenerBus event queue runs full, Spark dynamic allocation stops working: Spark fails to shrink the number of executors when there are no active jobs, because the driver "thinks" there are still active jobs (it never received the events marking them finished).
> P.S. What's worse, it also makes Spark flood the YARN ResourceManager with reservation requests, so YARN preemption doesn't function properly either (we're on Spark 2.1 / Hadoop 2.6).

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org