From: Sandy Ryza <sandy.ryza@cloudera.com>
To: Jun Feng Liu <liujunf@cn.ibm.com>
Cc: "dev@spark.apache.org" <dev@spark.apache.org>, Patrick Wendell <pwendell@gmail.com>
Date: Fri, 8 Aug 2014 00:49:44 -0700
Subject: Re: Fine-Grained Scheduler on Yarn

I think that would be useful work. I don't know the minute details of this
code, but in general TaskSchedulerImpl keeps track of pending tasks. Tasks
are organized into TaskSets, each of which corresponds to a particular
stage. Each TaskSet has a TaskSetManager, which directly tracks the pending
tasks for that stage.
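A rough sketch of how those pieces fit together, using simplified,
hypothetical class shapes rather than the actual Spark internals (the real
TaskSchedulerImpl and TaskSetManager have many more responsibilities):

```scala
import scala.collection.mutable

// Hypothetical, stripped-down model of the structures described above.
case class Task(stageId: Int, partition: Int)

// One TaskSet per stage: the unit of work handed to the scheduler.
case class TaskSet(stageId: Int, tasks: Seq[Task])

// Tracks the pending (not yet launched) tasks for a single stage's TaskSet.
class TaskSetManager(val taskSet: TaskSet) {
  private val pending = mutable.Queue(taskSet.tasks: _*)
  def pendingCount: Int = pending.size
  def launchNext(): Option[Task] =
    if (pending.nonEmpty) Some(pending.dequeue()) else None
}

// The scheduler keeps one manager per active TaskSet; summing their pending
// counts gives one way to approximate the backlog discussed below.
class SimpleTaskScheduler {
  private val managers = mutable.ArrayBuffer.empty[TaskSetManager]
  def submitTasks(taskSet: TaskSet): Unit = managers += new TaskSetManager(taskSet)
  def backlog: Int = managers.map(_.pendingCount).sum
}
```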
-Sandy


On Fri, Aug 8, 2014 at 12:37 AM, Jun Feng Liu <liujunf@cn.ibm.com> wrote:

> Yes, I think we need resource control at both levels (container numbers
> and dynamically changing container resources), which can make resource
> utilization much more effective, especially when more types of workload
> share the same infrastructure.
>
> Is there any way I can observe the task backlog in the scheduler backend?
> It sounds like the scheduler backend is triggered when a new TaskSet is
> submitted, but I haven't figured out whether there is a way to check the
> whole backlog of tasks inside it. I am interested in implementing some
> policy in the scheduler backend and testing how useful it turns out to be.
>
> Best Regards
>
> Jun Feng Liu
> IBM China Systems & Technology Laboratory in Beijing
> Phone: 86-10-82452683  E-mail: liujunf@cn.ibm.com
> BLD 28, ZGC Software Park, No.8 Rd. Dong Bei Wang West, Dist. Haidian,
> Beijing 100193, China
>
>
> Sandy Ryza <sandy.ryza@cloudera.com>
> 2014/08/08 15:14
> To: Jun Feng Liu/China/IBM@IBMCN
> Cc: Patrick Wendell <pwendell@gmail.com>, "dev@spark.apache.org" <dev@spark.apache.org>
> Subject: Re: Fine-Grained Scheduler on Yarn
>
> Hi Jun,
>
> Spark currently doesn't have that feature, i.e. it aims for a fixed number
> of executors per application regardless of resource usage, but it's
> definitely worth considering. We could start more executors when we have a
> large backlog of tasks and shut some down when we're underutilized.
>
> The fine-grained task scheduling is blocked on work from YARN that will
> allow changing the CPU allocation of a YARN container dynamically. The
> relevant JIRA for this dependency is YARN-1197, though YARN-1488 might
> serve this purpose as well if it comes first.
>
> -Sandy
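A minimal sketch of that backlog-driven sizing idea, assuming a hook that
can see the pending-task count and pick a new executor total; the names
here (ClusterState, desiredExecutors) are hypothetical, not an existing
SchedulerBackend API:

```scala
// Hypothetical policy: grow the executor count when the task backlog is large,
// shrink it when executors sit idle. Thresholds are illustrative only.
case class ClusterState(backlog: Int, totalExecutors: Int, busyExecutors: Int)

object ScalingPolicy {
  val tasksPerExecutor = 4   // assumed task slots per executor
  val minExecutors     = 2
  val maxExecutors     = 50

  /** Executor count to request from the cluster manager for the current state. */
  def desiredExecutors(s: ClusterState): Int = {
    val neededForBacklog = math.ceil(s.backlog.toDouble / tasksPerExecutor).toInt
    val target =
      if (s.backlog > 0) s.totalExecutors.max(neededForBacklog) // scale up for a backlog
      else s.busyExecutors                                      // release idle executors
    target.max(minExecutors).min(maxExecutors)
  }
}

// Example: 40 queued tasks on 5 fully busy executors -> request 10 executors.
// ScalingPolicy.desiredExecutors(ClusterState(backlog = 40, totalExecutors = 5, busyExecutors = 5))
```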
>
> On Thu, Aug 7, 2014 at 10:56 PM, Jun Feng Liu <liujunf@cn.ibm.com> wrote:
>
> > Thanks for the echo on this. Is it possible to adjust resources based
> > on container numbers? E.g. allocate more containers when the driver
> > needs more resources, and return some resources by deleting containers
> > when parts of the application already have enough cores/memory.
> >
> > Best Regards
> >
> > Jun Feng Liu
> > IBM China Systems & Technology Laboratory in Beijing
> >
> >
> > Patrick Wendell <pwendell@gmail.com>
> > 2014/08/08 13:10
> > To: Jun Feng Liu/China/IBM@IBMCN
> > Cc: "dev@spark.apache.org" <dev@spark.apache.org>
> > Subject: Re: Fine-Grained Scheduler on Yarn
> >
> > Hey, sorry about that - what I said was the opposite of what is true.
> >
> > The current YARN mode is equivalent to "coarse grained" Mesos. There is
> > no fine-grained scheduling on YARN at the moment. I'm not sure YARN
> > supports scheduling in units other than containers. Fine-grained
> > scheduling requires scheduling at the granularity of individual cores.
> >
> >
> > On Thu, Aug 7, 2014 at 9:43 PM, Patrick Wendell <pwendell@gmail.com>
> > wrote:
> > The current YARN mode is equivalent to what is called "fine grained"
> > mode in Mesos. The scheduling of tasks happens totally inside of the
> > Spark driver.
> >
> >
> > On Thu, Aug 7, 2014 at 7:50 PM, Jun Feng Liu <liujunf@cn.ibm.com>
> > wrote:
> > Anyone know the answer?
> >
> > Best Regards
> >
> > Jun Feng Liu
> > IBM China Systems & Technology Laboratory in Beijing
> >
> >
> > Jun Feng Liu/China/IBM
> > 2014/08/07 15:37
> > To: dev@spark.apache.org
> > Subject: Fine-Grained Scheduler on Yarn
> >
> > Hi there,
> >
> > I just became aware that right now Spark only supports a fine-grained
> > scheduler on Mesos, with MesosSchedulerBackend. The YARN scheduler
> > sounds like it only works on a coarse-grained model. Is there any plan
> > to implement a fine-grained scheduler for YARN? Or is there any
> > technical issue blocking us from doing that?
> >
> > Best Regards
> >
> > Jun Feng Liu
> > IBM China Systems & Technology Laboratory in Beijing
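For reference on the Mesos distinction raised in the original question: the
fine-grained versus coarse-grained choice on Mesos is made through
configuration rather than a separate entry point. A minimal sketch,
assuming a Spark 1.x application (the Mesos master URL is a placeholder):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch of selecting the Mesos scheduling mode (Spark 1.x). Left at its
// default of "false", spark.mesos.coarse uses the fine-grained
// MesosSchedulerBackend, where each Spark task runs as a Mesos task and cores
// are acquired per task. Setting it to "true" switches to coarse-grained mode
// with long-running executors, closer to how Spark currently runs on YARN.
val conf = new SparkConf()
  .setAppName("mesos-mode-example")
  .setMaster("mesos://mesos-master:5050") // placeholder Mesos master URL
  .set("spark.mesos.coarse", "true")      // "false" (the default) = fine-grained

val sc = new SparkContext(conf)
```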