spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chawla,Sumit " <sumitkcha...@gmail.com>
Subject Re: Mesos Spark Fine Grained Execution - CPU count
Date Tue, 27 Dec 2016 02:30:56 GMT
What is the expected effect of reducing the mesosExecutor.cores to zero?
What functionality of executor is impacted? Is the impact is just that it
just behaves like a regular process?

Regards
Sumit Chawla


On Mon, Dec 26, 2016 at 9:25 AM, Michael Gummelt <mgummelt@mesosphere.io>
wrote:

> > Using 0 for spark.mesos.mesosExecutor.cores is better than dynamic
> allocation
>
> Maybe for CPU, but definitely not for memory.  Executors never shut down
> in fine-grained mode, which means you only elastically grow and shrink CPU
> usage, not memory.
>
> On Sat, Dec 24, 2016 at 10:14 PM, Davies Liu <davies.liu@gmail.com> wrote:
>
>> Using 0 for spark.mesos.mesosExecutor.cores is better than dynamic
>> allocation, but have to pay a little more overhead for launching a
>> task, which should be OK if the task is not trivial.
>>
>> Since the direct result (up to 1M by default) will also go through
>> mesos, it's better to tune it lower, otherwise mesos could become the
>> bottleneck.
>>
>> spark.task.maxDirectResultSize
>>
>> On Mon, Dec 19, 2016 at 3:23 PM, Chawla,Sumit <sumitkchawla@gmail.com>
>> wrote:
>> > Tim,
>> >
>> > We will try to run the application in coarse grain mode, and share the
>> > findings with you.
>> >
>> > Regards
>> > Sumit Chawla
>> >
>> >
>> > On Mon, Dec 19, 2016 at 3:11 PM, Timothy Chen <tnachen@gmail.com>
>> wrote:
>> >
>> >> Dynamic allocation works with Coarse grain mode only, we wasn't aware
>> >> a need for Fine grain mode after we enabled dynamic allocation support
>> >> on the coarse grain mode.
>> >>
>> >> What's the reason you're running fine grain mode instead of coarse
>> >> grain + dynamic allocation?
>> >>
>> >> Tim
>> >>
>> >> On Mon, Dec 19, 2016 at 2:45 PM, Mehdi Meziane
>> >> <mehdi.meziane@ldmobile.net> wrote:
>> >> > We will be interested by the results if you give a try to Dynamic
>> >> allocation
>> >> > with mesos !
>> >> >
>> >> >
>> >> > ----- Mail Original -----
>> >> > De: "Michael Gummelt" <mgummelt@mesosphere.io>
>> >> > À: "Sumit Chawla" <sumitkchawla@gmail.com>
>> >> > Cc: user@mesos.apache.org, dev@mesos.apache.org, "User"
>> >> > <user@spark.apache.org>, dev@spark.apache.org
>> >> > Envoyé: Lundi 19 Décembre 2016 22h42:55 GMT +01:00 Amsterdam /
>> Berlin /
>> >> > Berne / Rome / Stockholm / Vienne
>> >> > Objet: Re: Mesos Spark Fine Grained Execution - CPU count
>> >> >
>> >> >
>> >> >> Is this problem of idle executors sticking around solved in Dynamic
>> >> >> Resource Allocation?  Is there some timeout after which Idle
>> executors
>> >> can
>> >> >> just shutdown and cleanup its resources.
>> >> >
>> >> > Yes, that's exactly what dynamic allocation does.  But again I have
>> no
>> >> idea
>> >> > what the state of dynamic allocation + mesos is.
>> >> >
>> >> > On Mon, Dec 19, 2016 at 1:32 PM, Chawla,Sumit <
>> sumitkchawla@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Great.  Makes much better sense now.  What will be reason to have
>> >> >> spark.mesos.mesosExecutor.cores more than 1, as this number doesn't
>> >> include
>> >> >> the number of cores for tasks.
>> >> >>
>> >> >> So in my case it seems like 30 CPUs are allocated to executors.
 And
>> >> there
>> >> >> are 48 tasks so 48 + 30 =  78 CPUs.  And i am noticing this gap
of
>> 30 is
>> >> >> maintained till the last task exits.  This explains the gap.
>>  Thanks
>> >> >> everyone.  I am still not sure how this number 30 is calculated.
 (
>> Is
>> >> it
>> >> >> dynamic based on current resources, or is it some configuration.
 I
>> >> have 32
>> >> >> nodes in my cluster).
>> >> >>
>> >> >> Is this problem of idle executors sticking around solved in Dynamic
>> >> >> Resource Allocation?  Is there some timeout after which Idle
>> executors
>> >> can
>> >> >> just shutdown and cleanup its resources.
>> >> >>
>> >> >>
>> >> >> Regards
>> >> >> Sumit Chawla
>> >> >>
>> >> >>
>> >> >> On Mon, Dec 19, 2016 at 12:45 PM, Michael Gummelt <
>> >> mgummelt@mesosphere.io>
>> >> >> wrote:
>> >> >>>
>> >> >>> >  I should preassume that No of executors should be less
than
>> number
>> >> of
>> >> >>> > tasks.
>> >> >>>
>> >> >>> No.  Each executor runs 0 or more tasks.
>> >> >>>
>> >> >>> Each executor consumes 1 CPU, and each task running on that
>> executor
>> >> >>> consumes another CPU.  You can customize this via
>> >> >>> spark.mesos.mesosExecutor.cores
>> >> >>> (https://github.com/apache/spark/blob/v1.6.3/docs/running-
>> on-mesos.md)
>> >> and
>> >> >>> spark.task.cpus
>> >> >>> (https://github.com/apache/spark/blob/v1.6.3/docs/configuration.md
>> )
>> >> >>>
>> >> >>> On Mon, Dec 19, 2016 at 12:09 PM, Chawla,Sumit <
>> sumitkchawla@gmail.com
>> >> >
>> >> >>> wrote:
>> >> >>>>
>> >> >>>> Ah thanks. looks like i skipped reading this "Neither will
>> executors
>> >> >>>> terminate when they’re idle."
>> >> >>>>
>> >> >>>> So in my job scenario,  I should preassume that No of executors
>> should
>> >> >>>> be less than number of tasks. Ideally one executor should
execute
>> 1
>> >> or more
>> >> >>>> tasks.  But i am observing something strange instead. 
I start my
>> job
>> >> with
>> >> >>>> 48 partitions for a spark job. In mesos ui i see that number
of
>> tasks
>> >> is 48,
>> >> >>>> but no. of CPUs is 78 which is way more than 48.  Here
i am
>> assuming
>> >> that 1
>> >> >>>> CPU is 1 executor.   I am not specifying any configuration
to set
>> >> number of
>> >> >>>> cores per executor.
>> >> >>>>
>> >> >>>> Regards
>> >> >>>> Sumit Chawla
>> >> >>>>
>> >> >>>>
>> >> >>>> On Mon, Dec 19, 2016 at 11:35 AM, Joris Van Remoortere
>> >> >>>> <joris@mesosphere.io> wrote:
>> >> >>>>>
>> >> >>>>> That makes sense. From the documentation it looks like
the
>> executors
>> >> >>>>> are not supposed to terminate:
>> >> >>>>>
>> >> >>>>> http://spark.apache.org/docs/latest/running-on-mesos.html#
>> >> fine-grained-deprecated
>> >> >>>>>>
>> >> >>>>>> Note that while Spark tasks in fine-grained will
relinquish
>> cores as
>> >> >>>>>> they terminate, they will not relinquish memory,
as the JVM does
>> >> not give
>> >> >>>>>> memory back to the Operating System. Neither will
executors
>> >> terminate when
>> >> >>>>>> they’re idle.
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> I suppose your task to executor CPU ratio is low enough
that it
>> looks
>> >> >>>>> like most of the resources are not being reclaimed.
If your tasks
>> >> were using
>> >> >>>>> significantly more CPU the amortized cost of the idle
executors
>> >> would not be
>> >> >>>>> such a big deal.
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> —
>> >> >>>>> Joris Van Remoortere
>> >> >>>>> Mesosphere
>> >> >>>>>
>> >> >>>>> On Mon, Dec 19, 2016 at 11:26 AM, Timothy Chen <
>> tnachen@gmail.com>
>> >> >>>>> wrote:
>> >> >>>>>>
>> >> >>>>>> Hi Chawla,
>> >> >>>>>>
>> >> >>>>>> One possible reason is that Mesos fine grain mode
also takes up
>> >> cores
>> >> >>>>>> to run the executor per host, so if you have 20
agents running
>> Fine
>> >> >>>>>> grained executor it will take up 20 cores while
it's still
>> running.
>> >> >>>>>>
>> >> >>>>>> Tim
>> >> >>>>>>
>> >> >>>>>> On Fri, Dec 16, 2016 at 8:41 AM, Chawla,Sumit <
>> >> sumitkchawla@gmail.com>
>> >> >>>>>> wrote:
>> >> >>>>>> > Hi
>> >> >>>>>> >
>> >> >>>>>> > I am using Spark 1.6. I have one query about
Fine Grained
>> model in
>> >> >>>>>> > Spark.
>> >> >>>>>> > I have a simple Spark application which transforms
A -> B.
>> Its a
>> >> >>>>>> > single
>> >> >>>>>> > stage application.  To begin the program,
It starts with 48
>> >> >>>>>> > partitions.
>> >> >>>>>> > When the program starts running, in mesos
UI it shows 48
>> tasks and
>> >> >>>>>> > 48 CPUs
>> >> >>>>>> > allocated to job.  Now as the tasks get done,
the number of
>> active
>> >> >>>>>> > tasks
>> >> >>>>>> > number starts decreasing.  How ever, the number
of CPUs does
>> not
>> >> >>>>>> > decrease
>> >> >>>>>> > propotionally.  When the job was about to
finish, there was a
>> >> single
>> >> >>>>>> > remaininig task, however CPU count was still
20.
>> >> >>>>>> >
>> >> >>>>>> > My questions, is why there is no one to one
mapping between
>> tasks
>> >> >>>>>> > and cpus
>> >> >>>>>> > in Fine grained?  How can these CPUs be released
when the job
>> is
>> >> >>>>>> > done, so
>> >> >>>>>> > that other jobs can start.
>> >> >>>>>> >
>> >> >>>>>> >
>> >> >>>>>> > Regards
>> >> >>>>>> > Sumit Chawla
>> >> >>>>>
>> >> >>>>>
>> >> >>>>
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>> --
>> >> >>> Michael Gummelt
>> >> >>> Software Engineer
>> >> >>> Mesosphere
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Michael Gummelt
>> >> > Software Engineer
>> >> > Mesosphere
>> >>
>>
>>
>>
>> --
>>  - Davies
>>
>
>
>
> --
> Michael Gummelt
> Software Engineer
> Mesosphere
>

Mime
View raw message