Date: Thu, 13 Aug 2009 11:32:48 -0700
Subject: Re: Intermediary Data on Fair Scheduler
From: Mithila Nagendra
To: common-user@hadoop.apache.org

Hi Todd,

Suppose two jobs are assigned to a pool: one has 1 map task and 1 reduce task, and the other has 5 map and 5 reduce tasks. How does the switch between these jobs take place? Let's say the scheduler starts with the bigger job and runs 1 map task. When it switches to the shorter job, what does it do with the intermediate data? For instance, in Hadoop On Demand, if we run a search query, where would the search keywords be stored?

I assume that if the bigger job is in the middle of a map task, the smaller job will wait for that task to end before its own map task is launched.

Thanks!
Mithila

On Thu, Aug 13, 2009 at 10:52 AM, Todd Lipcon wrote:

> Hi Mithila,
>
> I assume you're referring to fair scheduler preemption. In the preemption
> scenario, tasks are completely killed, not paused. It's not like a
> preemptive scheduler in your OS where things are "context switched". This is
> why the preemption is not enabled by default and has tuning parameters that
> only trigger preemption in certain situations.
>
> Hope that helps,
> -Todd
>
> On Thu, Aug 13, 2009 at 10:44 AM, Mithila Nagendra wrote:
>
> > Hello All
> >
> > When the fair scheduler switches between two jobs, what does it do with the
> > intermediary data? Does it dump the data/job states onto the disk (DFS)? Or
> > does it do a context switch (i.e. everything is in memory)? I was looking at
> > the scheduler for an application I'm working on, any pointers will be
> > appreciated!
> >
> > Thanks!
> > Mithila Nagendra
> > Arizona State University
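
The distinction drawn in the quoted reply (preempted tasks are killed, not paused) can be illustrated with a toy model. This is only a sketch, not Hadoop code: the `Task` class and the `run`, `preempt_by_kill`, and `preempt_by_pause` functions are hypothetical names invented for illustration, and the model assumes that a killed task attempt keeps none of its progress and starts over when it is rescheduled.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical stand-in for a single map or reduce task."""
    name: str
    total_work: int    # units of work needed to finish the task
    done: int = 0      # units completed by the current attempt
    attempts: int = 1  # number of attempts started so far


def run(task: Task, units: int) -> None:
    """Advance the current attempt by up to `units` of work."""
    task.done = min(task.total_work, task.done + units)


def preempt_by_kill(task: Task) -> int:
    """Kill-style preemption: discard the attempt and return the wasted work."""
    wasted = task.done
    task.done = 0        # assumption: nothing from the killed attempt is kept
    task.attempts += 1   # the task must be attempted again later
    return wasted


def preempt_by_pause(task: Task) -> int:
    """OS-style context switch, shown only for contrast: progress is kept."""
    return 0


# A map task from the bigger job is 6/10 done when it is preempted.
killed = Task("big-job-map-0", total_work=10)
run(killed, 6)
wasted_by_kill = preempt_by_kill(killed)

paused = Task("big-job-map-0", total_work=10)
run(paused, 6)
wasted_by_pause = preempt_by_pause(paused)

print(wasted_by_kill, killed.done, killed.attempts)
print(wasted_by_pause, paused.done, paused.attempts)
```

In this model the killed attempt throws away six units of work and no in-memory state is saved or swapped out, which matches the reply's point that preemption is not a context switch and is therefore something to trigger sparingly.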