mesos-dev mailing list archives

From Mehdi Meziane <mehdi.mezi...@ldmobile.net>
Subject Re: Mesos Spark Fine Grained Execution - CPU count
Date Mon, 19 Dec 2016 22:45:31 GMT
We would be interested in the results if you give dynamic allocation with Mesos a try!



----- Original Mail -----
From: "Michael Gummelt" <mgummelt@mesosphere.io>
To: "Sumit Chawla" <sumitkchawla@gmail.com>
Cc: user@mesos.apache.org, dev@mesos.apache.org, "User" <user@spark.apache.org>, dev@spark.apache.org

Sent: Monday, December 19, 2016, 22:42:55 GMT +01:00 Amsterdam / Berlin / Berne / Rome / Stockholm / Vienna
Subject: Re: Mesos Spark Fine Grained Execution - CPU count



> Is this problem of idle executors sticking around solved in Dynamic Resource Allocation?
> Is there some timeout after which idle executors can just shut down and clean up their resources?


Yes, that's exactly what dynamic allocation does. But again, I have no idea what the state
of dynamic allocation + Mesos is.
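
For reference, a minimal sketch of what enabling it would look like (these are the standard Spark property names; the idle timeout value is only illustrative, and on Mesos the external shuffle service would also need to be running on each agent):

    import org.apache.spark.SparkConf

    // Sketch: dynamic allocation settings for a Spark 1.6-era job.
    // Idle executors are released after spark.dynamicAllocation.executorIdleTimeout.
    val conf = new SparkConf()
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.shuffle.service.enabled", "true")              // external shuffle service is required
      .set("spark.dynamicAllocation.executorIdleTimeout", "60s") // example value, not a recommendation
      .set("spark.dynamicAllocation.minExecutors", "1")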



On Mon, Dec 19, 2016 at 1:32 PM, Chawla,Sumit < sumitkchawla@gmail.com > wrote: 



Great. That makes much better sense now. What would be the reason to set
spark.mesos.mesosExecutor.cores to more than 1, since this number doesn't include the cores used by tasks?


So in my case it seems like 30 CPUs are allocated to executors, and there are 48 tasks, so
48 + 30 = 78 CPUs. I am noticing that this gap of 30 is maintained until the last task exits,
which explains the gap. Thanks everyone. I am still not sure how this number 30 is calculated.
(Is it dynamic based on current resources, or is it some configuration? I have 32 nodes in
my cluster.)
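
One plausible reading of those numbers (just a sketch, assuming the defaults spark.mesos.mesosExecutor.cores = 1 and spark.task.cpus = 1, and assuming roughly 30 of the 32 nodes each launched one executor; the thread does not confirm the executor count):

    // Back-of-the-envelope for the 78 CPUs seen in the Mesos UI.
    val executorCores = 30 * 1  // ~30 executors, each holding 1 core for itself
    val taskCores     = 48 * 1  // 48 running tasks, 1 core each
    val totalCores    = executorCores + taskCores  // 78, the number reported by the UI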


Is this problem of idle executors sticking around solved in Dynamic Resource Allocation? Is
there some timeout after which idle executors can just shut down and clean up their resources?






Regards 
Sumit Chawla 





On Mon, Dec 19, 2016 at 12:45 PM, Michael Gummelt < mgummelt@mesosphere.io > wrote:






> I should presume that the number of executors should be less than the number of tasks.

No. Each executor runs 0 or more tasks. 

Each executor consumes 1 CPU, and each task running on that executor consumes another CPU.
You can customize this via spark.mesos.mesosExecutor.cores
(https://github.com/apache/spark/blob/v1.6.3/docs/running-on-mesos.md)
and spark.task.cpus (https://github.com/apache/spark/blob/v1.6.3/docs/configuration.md).
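
As a concrete sketch of that accounting (the property names come from the linked docs; the values shown are the defaults):

    import org.apache.spark.SparkConf

    // Fine-grained mode: each executor reserves spark.mesos.mesosExecutor.cores
    // for itself, and every task running on it adds spark.task.cpus on top.
    val conf = new SparkConf()
      .set("spark.mesos.mesosExecutor.cores", "1") // cores held by each executor (default 1)
      .set("spark.task.cpus", "1")                 // cores consumed per running task (default 1)
    // CPUs visible in the Mesos UI while the job runs:
    //   numExecutors * mesosExecutorCores + runningTasks * taskCpus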





On Mon, Dec 19, 2016 at 12:09 PM, Chawla,Sumit < sumitkchawla@gmail.com > wrote: 



Ah, thanks. Looks like I skipped reading this: "Neither will executors terminate when they’re
idle."


So in my job scenario, I should presume that the number of executors should be less than the
number of tasks. Ideally one executor should execute 1 or more tasks. But I am observing something
strange instead. I start my job with 48 partitions. In the Mesos UI I see that the
number of tasks is 48, but the number of CPUs is 78, which is way more than 48. Here I am assuming
that 1 CPU is 1 executor. I am not specifying any configuration to set the number of cores per
executor.



Regards 
Sumit Chawla 





On Mon, Dec 19, 2016 at 11:35 AM, Joris Van Remoortere < joris@mesosphere.io > wrote:




That makes sense. From the documentation it looks like the executors are not supposed to terminate:

http://spark.apache.org/docs/latest/running-on-mesos.html#fine-grained-deprecated 


Note that while Spark tasks in fine-grained will relinquish cores as they terminate, they
will not relinquish memory, as the JVM does not give memory back to the Operating System.
Neither will executors terminate when they’re idle. 


I suppose your task-to-executor CPU ratio is low enough that it looks like most of the resources
are not being reclaimed. If your tasks were using significantly more CPU, the amortized cost
of the idle executors would not be such a big deal.
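
To put rough numbers on that (an illustrative sketch using the figures from this thread, not measured data):

    // With spark.task.cpus = 1, the ~30 cores held by idle executors are the same
    // order of magnitude as the 48 task cores, so the idle share looks large.
    // If each task needed 8 cores instead, the same 30 cores would be a small fraction.
    val idleShareLowCpuTasks  = 30.0 / (30 + 48 * 1) // ~0.38 of the peak allocation
    val idleShareHighCpuTasks = 30.0 / (30 + 48 * 8) // ~0.07 of the peak allocation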






— 
Joris Van Remoortere 
Mesosphere 

On Mon, Dec 19, 2016 at 11:26 AM, Timothy Chen < tnachen@gmail.com > wrote: 


Hi Chawla, 

One possible reason is that Mesos fine-grained mode also takes up cores
to run the executor on each host, so if you have 20 agents running a
fine-grained executor, they will take up 20 cores while they are still running.

Tim 

On Fri, Dec 16, 2016 at 8:41 AM, Chawla,Sumit < sumitkchawla@gmail.com > wrote: 


> Hi 
> 
> I am using Spark 1.6, and I have one query about the fine-grained model in Spark.
> I have a simple Spark application which transforms A -> B. It's a single-stage
> application, and it starts with 48 partitions. When the program starts running,
> the Mesos UI shows 48 tasks and 48 CPUs allocated to the job. Now, as the tasks
> get done, the number of active tasks starts decreasing. However, the number of
> CPUs does not decrease proportionally. When the job was about to finish, there
> was a single remaining task, yet the CPU count was still 20.
>
> My question is: why is there no one-to-one mapping between tasks and CPUs in
> fine-grained mode? How can these CPUs be released when the job is done, so that
> other jobs can start?
> 
> 
> Regards 
> Sumit Chawla 





-- 







Michael Gummelt 
Software Engineer 
Mesosphere 




-- 







Michael Gummelt 
Software Engineer 
Mesosphere 
