From: Alan Gates
To: core-user@hadoop.apache.org
Subject: Re: Hadoop scheduling question
Date: Fri, 5 Jun 2009 11:44:05 -0700
In-Reply-To: <4A295B03.2090205@cs.washington.edu>

To add a little context, Pig uses Hadoop's JobControl to schedule its
jobs. Pig defines the dependencies between jobs in JobControl, and then
submits the entire graph of jobs. So, using JobControl, does Hadoop
schedule jobs serially or in parallel (assuming no dependencies)?

Alan.

On Jun 5, 2009, at 10:50 AM, Kristi Morton wrote:

> Hi Pankil,
>
> Sorry about having to send my question email twice to the list. The
> first time I sent it I had forgotten to subscribe to the list. I
> resent it after subscribing, and your response to the first email I
> sent did not make it into my inbox. I saw your response on the
> archives list.
>
> So, to recap, you said:
>
> "We are not able to carry out all joins in a single job. We also
> tried our Hadoop code using Pig scripts and found that for each join
> in the Pig script a new job is used. So basically what I think is
> that it's a sequential process to handle types of joins where the
> output of one job is required as an input to the other one."
>
> I, too, have seen this sequential behavior with joins. However, it
> seems like it could be possible for there to be two jobs executing in
> parallel whose output is the input to the subsequent job. Is this
> possible or are all jobs scheduled sequentially?
>
> Thanks,
> Kristi
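
P.S. To make the question concrete, here is a minimal sketch of the kind of job graph being discussed: two independent joins feeding a final join. It is a plain-Java model, not Hadoop's JobControl API; the class name, the waves method, and the job names joinA, joinB, and finalJoin are all hypothetical. It only groups jobs by when their dependencies are satisfied, and it assumes nothing about how Hadoop actually schedules them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of a job dependency graph; this is NOT Hadoop's JobControl API.
class JobGraphSketch {

    // Groups jobs into "waves": every job in a wave has all of its
    // dependencies satisfied by earlier waves, so nothing in the graph
    // itself forces the jobs within one wave to run one after another.
    static List<List<String>> waves(Map<String, List<String>> dependsOn) {
        List<List<String>> result = new ArrayList<>();
        Set<String> done = new LinkedHashSet<>();
        while (done.size() < dependsOn.size()) {
            List<String> ready = new ArrayList<>();
            for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
                if (!done.contains(e.getKey()) && done.containsAll(e.getValue())) {
                    ready.add(e.getKey());
                }
            }
            if (ready.isEmpty()) {
                // No job can make progress: the graph has a cycle or names an unknown job.
                throw new IllegalArgumentException("cycle or unknown dependency in job graph");
            }
            done.addAll(ready);
            result.add(ready);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed example graph: two independent joins feeding a third job.
        Map<String, List<String>> dependsOn = new LinkedHashMap<>();
        dependsOn.put("joinA", List.of());
        dependsOn.put("joinB", List.of());
        dependsOn.put("finalJoin", List.of("joinA", "joinB"));
        System.out.println(waves(dependsOn)); // prints [[joinA, joinB], [finalJoin]]
    }
}
```

In this model joinA and joinB land in the same wave, so the dependency graph permits them to run side by side. Whether JobControl actually submits them at the same time is the open question in this thread; the sketch does not answer it.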