mesos-user mailing list archives

From Timothy Chen <tnac...@gmail.com>
Subject Re: Mesos Spark Fine Grained Execution - CPU count
Date Mon, 19 Dec 2016 23:11:56 GMT
Dynamic allocation works with coarse grain mode only. We weren't aware of
a need for fine grain mode after we enabled dynamic allocation support
on the coarse grain mode.

What's the reason you're running fine grain mode instead of coarse
grain + dynamic allocation?

Tim
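
For readers who want to try the coarse grain + dynamic allocation setup mentioned above, here is a minimal sketch of the Spark properties that are typically involved. It only builds `--conf` arguments for spark-submit. The helper name `to_submit_args` and the 60s idle timeout are illustrative assumptions, not something stated in this thread, and dynamic allocation also expects an external shuffle service to be running on each agent, which this sketch does not set up.

```python
# Spark properties for coarse-grained mode with dynamic allocation.
# The property names are standard Spark settings; the values are examples only.
DYNAMIC_ALLOCATION_CONF = {
    "spark.mesos.coarse": "true",                    # run in coarse-grained mode
    "spark.dynamicAllocation.enabled": "true",       # let Spark add/remove executors
    "spark.shuffle.service.enabled": "true",         # required by dynamic allocation
    "spark.dynamicAllocation.executorIdleTimeout": "60s",  # assumed example timeout
}


def to_submit_args(conf):
    """Turn a property dict into a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the arguments so they can be pasted after `spark-submit`.
    print(" ".join(to_submit_args(DYNAMIC_ALLOCATION_CONF)))
```

With settings along these lines, executors that stay idle longer than the timeout are released, which is the behaviour asked about further down the thread.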

On Mon, Dec 19, 2016 at 2:45 PM, Mehdi Meziane
<mehdi.meziane@ldmobile.net> wrote:
> We would be interested in the results if you give dynamic allocation
> with Mesos a try!
>
>
> ----- Original Message -----
> From: "Michael Gummelt" <mgummelt@mesosphere.io>
> To: "Sumit Chawla" <sumitkchawla@gmail.com>
> Cc: user@mesos.apache.org, dev@mesos.apache.org, "User"
> <user@spark.apache.org>, dev@spark.apache.org
> Sent: Monday, 19 December 2016 22:42:55 GMT +01:00 Amsterdam / Berlin /
> Bern / Rome / Stockholm / Vienna
> Subject: Re: Mesos Spark Fine Grained Execution - CPU count
>
>
>> Is this problem of idle executors sticking around solved in Dynamic
>> Resource Allocation?  Is there some timeout after which idle executors can
>> just shut down and clean up their resources?
>
> Yes, that's exactly what dynamic allocation does.  But again I have no idea
> what the state of dynamic allocation + mesos is.
>
> On Mon, Dec 19, 2016 at 1:32 PM, Chawla,Sumit <sumitkchawla@gmail.com>
> wrote:
>>
>> Great.  That makes much more sense now.  What would be the reason to set
>> spark.mesos.mesosExecutor.cores to more than 1, given that this number
>> doesn't include the cores for tasks?
>>
>> So in my case it seems like 30 CPUs are allocated to executors, and there
>> are 48 tasks, so 48 + 30 = 78 CPUs.  I am noticing that this gap of 30 is
>> maintained until the last task exits.  This explains the gap.  Thanks,
>> everyone.  I am still not sure how this number 30 is calculated.  (Is it
>> dynamic based on current resources, or is it some configuration?  I have 32
>> nodes in my cluster.)
>>
>> Is this problem of idle executors sticking around solved in Dynamic
>> Resource Allocation?  Is there some timeout after which idle executors can
>> just shut down and clean up their resources?
>>
>>
>> Regards
>> Sumit Chawla
>>
>>
>> On Mon, Dec 19, 2016 at 12:45 PM, Michael Gummelt <mgummelt@mesosphere.io>
>> wrote:
>>>
>>> > I should presume that the number of executors should be less than the
>>> > number of tasks.
>>>
>>> No.  Each executor runs 0 or more tasks.
>>>
>>> Each executor consumes 1 CPU, and each task running on that executor
>>> consumes another CPU.  You can customize this via
>>> spark.mesos.mesosExecutor.cores
>>> (https://github.com/apache/spark/blob/v1.6.3/docs/running-on-mesos.md) and
>>> spark.task.cpus
>>> (https://github.com/apache/spark/blob/v1.6.3/docs/configuration.md)
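
The accounting described above can be sketched as a small calculation. This is only an illustration of the rule "each executor holds its own cores, and each running task holds its own CPUs on top of that". The function name is hypothetical, and the defaults of 1 for both settings follow the description in this message.

```python
def fine_grained_cpus(active_executors, running_tasks,
                      executor_cores=1.0, task_cpus=1):
    """Estimate CPUs held by a fine-grained Spark job on Mesos.

    executor_cores stands for spark.mesos.mesosExecutor.cores and
    task_cpus for spark.task.cpus; both are assumed to be 1 here.
    """
    return active_executors * executor_cores + running_tasks * task_cpus


if __name__ == "__main__":
    # Figures reported in this thread: 30 executors and 48 running tasks.
    print(fine_grained_cpus(30, 48))
    # Once all tasks finish, the executors still hold their own cores.
    print(fine_grained_cpus(30, 0))
```

Under these assumptions, 30 executors with 48 running tasks account for the 78 CPUs seen in the Mesos UI, and the executor share stays allocated while the executors remain alive.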
>>>
>>> On Mon, Dec 19, 2016 at 12:09 PM, Chawla,Sumit <sumitkchawla@gmail.com>
>>> wrote:
>>>>
>>>> Ah, thanks. Looks like I skipped reading this: "Neither will executors
>>>> terminate when they’re idle."
>>>>
>>>> So in my job scenario, I should presume that the number of executors
>>>> should be less than the number of tasks. Ideally one executor should
>>>> execute 1 or more tasks.  But I am observing something strange instead.
>>>> I start my Spark job with 48 partitions. In the Mesos UI I see that the
>>>> number of tasks is 48, but the number of CPUs is 78, which is way more
>>>> than 48.  Here I am assuming that 1 CPU is 1 executor.  I am not
>>>> specifying any configuration to set the number of cores per executor.
>>>>
>>>> Regards
>>>> Sumit Chawla
>>>>
>>>>
>>>> On Mon, Dec 19, 2016 at 11:35 AM, Joris Van Remoortere
>>>> <joris@mesosphere.io> wrote:
>>>>>
>>>>> That makes sense. From the documentation it looks like the executors
>>>>> are not supposed to terminate:
>>>>>
>>>>> http://spark.apache.org/docs/latest/running-on-mesos.html#fine-grained-deprecated
>>>>>>
>>>>>> Note that while Spark tasks in fine-grained will relinquish cores as
>>>>>> they terminate, they will not relinquish memory, as the JVM does not
>>>>>> give memory back to the Operating System. Neither will executors
>>>>>> terminate when they’re idle.
>>>>>
>>>>>
>>>>> I suppose your task to executor CPU ratio is low enough that it looks
>>>>> like most of the resources are not being reclaimed. If your tasks were
>>>>> using significantly more CPU, the amortized cost of the idle executors
>>>>> would not be such a big deal.
>>>>>
>>>>>
>>>>> —
>>>>> Joris Van Remoortere
>>>>> Mesosphere
>>>>>
>>>>> On Mon, Dec 19, 2016 at 11:26 AM, Timothy Chen <tnachen@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi Chawla,
>>>>>>
>>>>>> One possible reason is that Mesos fine grain mode also takes up cores
>>>>>> to run the executor on each host, so if you have 20 agents each running
>>>>>> a fine grained executor, they will take up 20 cores while still running.
>>>>>>
>>>>>> Tim
>>>>>>
>>>>>> On Fri, Dec 16, 2016 at 8:41 AM, Chawla,Sumit <sumitkchawla@gmail.com>
>>>>>> wrote:
>>>>>> > Hi
>>>>>> >
>>>>>> > I am using Spark 1.6. I have one query about the fine-grained model
>>>>>> > in Spark. I have a simple Spark application which transforms A -> B.
>>>>>> > It's a single-stage application.  To begin, the program starts with
>>>>>> > 48 partitions. When the program starts running, the Mesos UI shows
>>>>>> > 48 tasks and 48 CPUs allocated to the job.  Now as the tasks get
>>>>>> > done, the number of active tasks starts decreasing.  However, the
>>>>>> > number of CPUs does not decrease proportionally.  When the job was
>>>>>> > about to finish, there was a single remaining task, but the CPU
>>>>>> > count was still 20.
>>>>>> >
>>>>>> > My question is: why is there no one-to-one mapping between tasks
>>>>>> > and CPUs in fine-grained mode?  How can these CPUs be released when
>>>>>> > the job is done, so that other jobs can start?
>>>>>> >
>>>>>> >
>>>>>> > Regards
>>>>>> > Sumit Chawla
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Michael Gummelt
>>> Software Engineer
>>> Mesosphere
>>
>>
>
>
>
> --
> Michael Gummelt
> Software Engineer
> Mesosphere
