spark-issues mailing list archives

From "Thomas Graves (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-11701) YARN - dynamic allocation and speculation active task accounting wrong
Date Tue, 01 Dec 2015 16:46:10 GMT

    [ https://issues.apache.org/jira/browse/SPARK-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034042#comment-15034042 ]

Thomas Graves commented on SPARK-11701:
---------------------------------------

Tested on the latest 1.6 branch and I am no longer seeing the TransportResponseHandler exception.
I do still see the original issue.

Looking at the logs, the INFO message below is printed near the end for tasks that ran on
the executors still showing active tasks. I'm guessing the task-finished event is being
ignored and the active task accounting is therefore never updated.

15/12/01 16:35:16 INFO TaskSetManager: Ignoring task-finished event for 25.1 in stage 0.0 because task 25 has already completed successfully
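
The suspected failure mode can be illustrated with a toy model. This is not Spark's code: `ToyListener`, `on_task_start`, `on_task_end`, and the executor ids are hypothetical, and the assumption that the finish event of the losing speculative attempt never reaches the accounting is only the guess stated above, not a confirmed root cause.

```python
from collections import defaultdict


class ToyListener:
    """Hypothetical stand-in for per-executor active-task accounting."""

    def __init__(self):
        self.active = defaultdict(int)  # executor id -> running task attempts

    def on_task_start(self, executor):
        self.active[executor] += 1

    def on_task_end(self, executor):
        self.active[executor] -= 1


def run(drop_duplicate_finish):
    """Simulate task 25 with original attempt 25.0 and speculative attempt 25.1."""
    listener = ToyListener()
    completed = set()

    listener.on_task_start("exec-1")  # attempt 25.0
    listener.on_task_start("exec-2")  # attempt 25.1 (speculative copy)

    # Attempt 25.0 finishes first; 25.1 finishes after the task is already done.
    for task, executor in [(25, "exec-1"), (25, "exec-2")]:
        if task in completed and drop_duplicate_finish:
            # Assumed bug: the late finish is ignored and never reaches the listener.
            continue
        completed.add(task)
        listener.on_task_end(executor)

    return dict(listener.active)


buggy = run(drop_duplicate_finish=True)
fixed = run(drop_duplicate_finish=False)
print(buggy)  # exec-2 keeps one phantom active task
print(fixed)  # every executor is back to zero
```

In the `buggy` run, `exec-2` never drops back to zero active tasks, which in this model would match an executor that looks busy and so is never considered idle.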


> YARN - dynamic allocation and speculation active task accounting wrong
> ----------------------------------------------------------------------
>
>                 Key: SPARK-11701
>                 URL: https://issues.apache.org/jira/browse/SPARK-11701
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.1
>            Reporter: Thomas Graves
>            Priority: Critical
>
> I am using dynamic container allocation and speculation and am seeing issues with the
> active task accounting. The Executor UI still shows active tasks on an executor even though
> the job/stage has completed. I think it's also preventing dynamic allocation from releasing
> containers, because it thinks there are still tasks running.
> It's easily reproduced using spark-shell: turn on dynamic allocation, then run a simple
> wordcount on a decent-sized file with the speculation parameters set low:
>  spark.dynamicAllocation.enabled true
>  spark.shuffle.service.enabled true
>  spark.dynamicAllocation.maxExecutors 10
>  spark.dynamicAllocation.minExecutors 2
>  spark.dynamicAllocation.initialExecutors 10
>  spark.dynamicAllocation.executorIdleTimeout 40s
> $SPARK_HOME/bin/spark-shell --conf spark.speculation=true --conf spark.speculation.multiplier=0.2 --conf spark.speculation.quantile=0.1 --master yarn --deploy-mode client --executor-memory 4g --driver-memory 4g


