spark-dev mailing list archives

From Sandy Ryza <sandy.r...@cloudera.com>
Subject Re: Fine-Grained Scheduler on Yarn
Date Fri, 08 Aug 2014 07:49:44 GMT
I think that would be useful work.  I don't know the minute details of this
code, but in general TaskSchedulerImpl keeps track of pending tasks.  Tasks
are organized into TaskSets, each of which corresponds to a particular
stage.  Each TaskSet has a TaskSetManager, which directly tracks the
pending tasks for that stage.
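As a rough sketch (plain illustrative Python, not Spark's actual Scala code -- the class names mirror the concepts only), the bookkeeping looks something like this:

```python
# Simplified, illustrative model of the hierarchy described above:
# one TaskSet per stage, one TaskSetManager tracking its pending tasks,
# and a scheduler that could expose the whole-application backlog.

class TaskSet:
    """All tasks for one stage."""
    def __init__(self, stage_id, tasks):
        self.stage_id = stage_id
        self.tasks = tasks

class TaskSetManager:
    """Directly tracks the pending (not yet launched) tasks for one TaskSet."""
    def __init__(self, task_set):
        self.task_set = task_set
        self.pending = list(task_set.tasks)

    def num_pending(self):
        return len(self.pending)

class TaskSchedulerImpl:
    """Holds one TaskSetManager per submitted stage."""
    def __init__(self):
        self.managers = []

    def submit_task_set(self, task_set):
        self.managers.append(TaskSetManager(task_set))

    def total_backlog(self):
        # The application-wide backlog a scheduler backend could inspect.
        return sum(m.num_pending() for m in self.managers)

scheduler = TaskSchedulerImpl()
scheduler.submit_task_set(TaskSet(stage_id=0, tasks=range(100)))
scheduler.submit_task_set(TaskSet(stage_id=1, tasks=range(50)))
print(scheduler.total_backlog())  # 150
```

In the real code you'd want to look at how TaskSchedulerImpl exposes its TaskSetManagers, since that's where the per-stage pending counts live.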

-Sandy


On Fri, Aug 8, 2014 at 12:37 AM, Jun Feng Liu <liujunf@cn.ibm.com> wrote:

> Yes, I think we need resource control at both levels (container counts and
> dynamically changing container resources), which would make resource
> utilization much more effective, especially when multiple types of workload
> share the same infrastructure.
>
> Is there any way I can observe the task backlog in the scheduler backend?
> It sounds like the scheduler backend is triggered when a new task set is
> submitted, but I have not figured out whether there is a way to check the
> whole task backlog inside it. I am interested in implementing some policy in
> the scheduler backend and testing to see how useful it would be.
>
> Best Regards
>
>
> *Jun Feng Liu*
> IBM China Systems & Technology Laboratory in Beijing
>
>   ------------------------------
> *Phone: *86-10-82452683
> * E-mail:* *liujunf@cn.ibm.com* <liujunf@cn.ibm.com>
>
> BLD 28,ZGC Software Park
> No.8 Rd.Dong Bei Wang West, Dist.Haidian Beijing 100193
> China
>
>
>
>
>
>  *Sandy Ryza <sandy.ryza@cloudera.com>*
>
> 2014/08/08 15:14
>   To
> Jun Feng Liu/China/IBM@IBMCN,
> cc
> Patrick Wendell <pwendell@gmail.com>, "dev@spark.apache.org" <
> dev@spark.apache.org>
> Subject
> Re: Fine-Grained Scheduler on Yarn
>
>
>
>
> Hi Jun,
>
> Spark currently doesn't have that feature, i.e. it aims for a fixed number
> of executors per application regardless of resource usage, but it's
> definitely worth considering.  We could start more executors when we have a
> large backlog of tasks and shut some down when we're underutilized.
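A policy along those lines could be as simple as the following hypothetical sketch (the slot counts and bounds are made up; none of this exists in Spark today):

```python
# Hypothetical scaling policy: pick a target executor count from the
# current task backlog, bounded between a minimum and maximum.

TASKS_PER_EXECUTOR = 8   # assumed task slots per executor
MIN_EXECUTORS = 2
MAX_EXECUTORS = 50

def desired_executors(backlog, running_tasks):
    """Target executor count for the current load."""
    demand = backlog + running_tasks
    # Round up: enough executors to cover every demanded task slot.
    needed = -(-demand // TASKS_PER_EXECUTOR)
    return max(MIN_EXECUTORS, min(MAX_EXECUTORS, needed))

print(desired_executors(backlog=100, running_tasks=20))  # 15
print(desired_executors(backlog=0, running_tasks=0))     # 2
```

The interesting open questions are how often to re-evaluate and how much hysteresis to apply so we don't thrash executors up and down.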
>
> The fine-grained task scheduling is blocked on work from YARN that will
> allow changing the CPU allocation of a YARN container dynamically.  The
> relevant JIRA for this dependency is YARN-1197, though YARN-1488 might
> serve this purpose as well if it comes first.
>
> -Sandy
>
>
> On Thu, Aug 7, 2014 at 10:56 PM, Jun Feng Liu <liujunf@cn.ibm.com> wrote:
>
> > Thanks for the echo on this. Would it be possible to adjust resources
> > based on container numbers? E.g., allocate more containers when the driver
> > needs more resources, and return resources by deleting containers once
> > some of them already have enough cores/memory.
> >
> > Best Regards
> >
> >
> > *Jun Feng Liu*
> > IBM China Systems & Technology Laboratory in Beijing
> >
> >
> >  *Patrick Wendell <pwendell@gmail.com>*
>
> >
> > 2014/08/08 13:10
> >   To
> > Jun Feng Liu/China/IBM@IBMCN,
> > cc
> > "dev@spark.apache.org" <dev@spark.apache.org>
> > Subject
> > Re: Fine-Grained Scheduler on Yarn
> >
> >
> >
> >
> > Hey sorry about that - what I said was the opposite of what is true.
> >
> > The current YARN mode is equivalent to "coarse grained" Mesos. There is no
> > fine-grained scheduling on YARN at the moment. I'm not sure YARN supports
> > scheduling in units other than containers. Fine-grained scheduling requires
> > scheduling at the granularity of individual cores.
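To make the utilization difference concrete, here is a toy comparison with made-up numbers (not a measurement of anything real):

```python
# Toy illustration of why core-level granularity matters.
# Coarse-grained: an app holds its full core reservation for its lifetime.
# Fine-grained: cores are held only while tasks are actually running.

app_cores_reserved = 16                     # coarse-grained reservation
timeline_active_cores = [16, 12, 4, 4, 1]   # cores actually busy per interval

coarse_core_intervals = app_cores_reserved * len(timeline_active_cores)
fine_core_intervals = sum(timeline_active_cores)

print(coarse_core_intervals)  # 80 core-intervals held
print(fine_core_intervals)    # 37 core-intervals held
```

In the coarse-grained case the other ~43 core-intervals are reserved but idle, which is exactly the waste dynamic container resizing would address.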
> >
> >
> > On Thu, Aug 7, 2014 at 9:43 PM, Patrick Wendell <pwendell@gmail.com> wrote:
> > The current YARN mode is equivalent to what is called "fine grained" mode
> > in Mesos. The scheduling of tasks happens totally inside of the Spark
> > driver.
> >
> >
> > On Thu, Aug 7, 2014 at 7:50 PM, Jun Feng Liu <liujunf@cn.ibm.com> wrote:
> > Any one know the answer?
> > Best Regards
> >
> >
> > *Jun Feng Liu*
> > IBM China Systems & Technology Laboratory in Beijing
> >
> >
> >   *Jun Feng Liu/China/IBM*
> >
> > 2014/08/07 15:37
> >
> >   To
> > *dev@spark.apache.org* <dev@spark.apache.org>,
>
> > cc
> >   Subject
> > Fine-Grained Scheduler on Yarn
> >
> >
> >
> >
> >
> > Hi there,
> >
> > I just became aware that Spark currently only supports a fine-grained
> > scheduler on Mesos, via MesosSchedulerBackend. The YARN scheduler seems to
> > work only in a coarse-grained model. Is there any plan to implement a
> > fine-grained scheduler for YARN? Or is there a technical issue blocking
> > us from doing that?
> >
> > Best Regards
> >
> >
> > *Jun Feng Liu*
> > IBM China Systems & Technology Laboratory in Beijing
> >
> >
> >
>
>
