Subject: Re: Fair Scheduler of Hadoop
From: Lin Ma <linlma@gmail.com>
To: Joep Rottinghuis <jrottinghuis@gmail.com>
Cc: user@hadoop.apache.org
Date: Sat, 26 Jan 2013 23:38:19 +0800

Thanks Joep, smart answer! All of my confusions are gone. Have a good weekend.

regards,
Lin

On Tue, Jan 22, 2013 at 2:00 AM, Joep Rottinghuis wrote:

> You could configure it like that if you wanted. Keep in mind that it would
> waste some resources. Imagine a 10-minute task that has been running for 9
> minutes. If you have that task killed immediately, it would have to be
> re-scheduled and redo all 10 minutes.
> Give it another minute and the task is complete and out of the way.
>
> So, consider how busy your cluster is overall and how much you are willing
> to wait for fairness, trading this off against a certain amount of waste.
>
> Cheers,
>
> Joep
>
> Sent from my iPhone
>
> On Jan 21, 2013, at 9:30 AM, Lin Ma <linlma@gmail.com> wrote:
>
> Hi Joep,
>
> Excellent answer! I think you have answered my confusions. One remaining
> issue after reading this document again, even though it is old. :-)
>
> It is mentioned, "which will allow you to set how long each pool will
> wait before preempting other jobs' tasks to reach its guaranteed capacity";
> my question is: why does each pool need to wait here?
> If a pool cannot get its
> guaranteed capacity because jobs in other pools over-use the capacity,
> shouldn't we kill such jobs immediately? I would appreciate it if you could
> elaborate a bit more on why we need to wait even for the guaranteed capacity.
>
> regards,
> Lin
>
> On Mon, Jan 21, 2013 at 8:24 AM, Joep Rottinghuis wrote:
>
>> Lin,
>>
>> The article you are reading is old.
>> Fair Scheduler does have preemption.
>> Tasks get killed and rerun later, potentially on a different node.
>>
>> You can set a minimum / guaranteed capacity. The sum of those across
>> pools would typically equal the total capacity of your cluster, or less.
>> Then you can configure each pool to go beyond that capacity. That would
>> happen if the cluster is temporarily not used to its full capacity.
>> Then, when the demand for capacity increases and jobs are queued in other
>> pools that are not running at their minimum guaranteed capacity, some
>> long-running tasks from jobs in the pool that is using more than its
>> minimum capacity get killed (to be run again later).
>>
>> Does that make sense?
>>
>> Cheers,
>>
>> Joep
>>
>> Sent from my iPhone
>>
>> On Jan 20, 2013, at 6:25 AM, Lin Ma <linlma@gmail.com> wrote:
>>
>> Hi guys,
>>
>> I have a quick question regarding the Fair Scheduler of Hadoop. I am
>> reading this article:
>> http://blog.cloudera.com/blog/2008/11/job-scheduling-in-hadoop/
>> My question is about the following statement: "There is currently no
>> support for preemption of long tasks, but this is being added in
>> HADOOP-4665, which will allow you to set how long each pool will wait
>> before preempting other jobs' tasks to reach its guaranteed capacity."
>>
>> My questions are:
>>
>> 1. What does "preemption of long tasks" mean? Killing long-running tasks,
>> pausing long-running tasks to give resources to other tasks, or something
>> else?
>> 2. I am also confused about "set how long each pool will wait before
>> preempting other jobs' tasks to reach its guaranteed capacity". What does
>> "reach its guaranteed capacity" mean? I think when using the Fair
>> Scheduler, each pool has predefined resource-allocation settings (and the
>> settings guarantee each pool the resources as configured); is that true?
>> In what situations would a pool not have its guaranteed (or configured)
>> capacity?
>>
>> regards,
>> Lin
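[Editor's note] The per-pool preemption wait that the thread and HADOOP-4665 discuss is configured in the Fair Scheduler's allocation file. Below is a minimal sketch of such a file for the MR1-era Fair Scheduler; the pool name and all numeric values are illustrative (not from the thread), and exact element names may vary by Hadoop version.

```xml
<?xml version="1.0"?>
<!-- Fair Scheduler allocation file (e.g. fair-scheduler.xml).
     Pool name and values are illustrative. -->
<allocations>
  <pool name="production">
    <!-- Minimum (guaranteed) share of map and reduce slots -->
    <minMaps>20</minMaps>
    <minReduces>10</minReduces>
    <!-- Seconds this pool stays below its minimum share before it may
         preempt (kill) tasks from pools running over their share -->
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <!-- Cluster-wide default: seconds a pool stays below half of its fair
       share before preempting tasks to reach that fair share -->
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>
```

A pool falls below its guaranteed capacity exactly in the situation Joep describes: other pools were allowed to grow past their minimums while the cluster was underused, and their tasks still occupy the slots.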
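[Editor's note] Joep's 10-minute-task example can be quantified with a small back-of-the-envelope sketch (the function and numbers are illustrative, not from the thread): killing a nearly finished task immediately throws away all its progress, while a short grace period can let it finish at no cost.

```python
# Hypothetical illustration of the preemption trade-off Joep describes:
# work wasted when an over-share task is killed, as a function of the
# grace period granted before the kill.

def wasted_minutes(task_length, progress, grace_period):
    """Minutes of work thrown away if a task with `progress` minutes done
    is preempted, given a kill grace period of `grace_period` minutes."""
    remaining = task_length - progress
    if remaining <= grace_period:
        return 0.0  # task finishes within the grace period; nothing is lost
    # Task is killed after the grace period; all work done so far is redone.
    return progress + grace_period

# Joep's example: a 10-minute task that is 9 minutes in.
print(wasted_minutes(10, 9, 0))  # kill immediately: 9 minutes redone
print(wasted_minutes(10, 9, 1))  # wait one more minute: task completes, 0 wasted
```

The trade-off is the one Joep names: a longer grace period wastes less completed work but makes starved pools wait longer for their guaranteed capacity.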