From: Ted Dunning
Date: Thu, 22 May 2008 19:16:00 -0700
Subject: Re: Can you run multiple simultaneous hadoop jobs?

I think that there is a conf parameter that sets the maximum number of map invocations that will be used by your program.
Failing that, you can always set the number of splits to a small number, but that is less likely to balance computation well. Better to have a significantly larger number of splits than map nodes.


On 5/22/08 6:58 PM, "Kayla Jay" wrote:

> By that, do you mean setting the # of mappers?
>
>
> ----- Original Message ----
> From: Ted Dunning
> To: core-user@hadoop.apache.org
> Sent: Thursday, May 22, 2008 5:19:32 PM
> Subject: Re: Can you run multiple simultaneous hadoop jobs?
>
>
> You definitely can run more than one job on a hadoop cluster. But if one of
> the jobs asks to use all of the map or reduce nodes, then the other job will
> have to wait for some of the nodes to free up before proceeding.
>
> Try limiting the number of map nodes and see how that changes matters.
>
>
> On 5/22/08 1:46 PM, "Kayla Jay" wrote:
>
>> Hello.
>>
>> I'm trying to figure out why I need to use HOD vs. trying to run multiple
>> jobs at the same time on the same set of resources. Is it possible to run
>> multiple hadoop jobs at the same time on the same set of input data? I tried
>> to run different jobs on the same set of data at the same time, but it takes
>> a while (way while) and almost appears as if it queues up and the next job
>> has to wait and so forth before completing.
>>
>> So, I tried moving onto HOD. It's not very apparent why one would want to
>> use HOD to run on different nodes at the same time for different jobs that
>> access the same set of input data.
>>
>> Can anyone provide any feedback on running multiple jobs at the same time on
>> the same set of data? HOD use? Why would I have to run HOD and schedule
>> running multiple jobs at the same time on the same set of data, but within
>> their own set of resources in the cluster?
>>
>> Thanks
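
To illustrate the point above about having many more splits than map nodes, here is a minimal sketch in Python. It is not Hadoop code and does not model Hadoop's real scheduler: it assumes a simple greedy rule where each split goes to whichever map slot frees up first, and the function name `makespan`, the slot count, and the split costs are all made up for the example. The same 120 units of work are cut either into 4 uneven splits or into 20 smaller ones.

```python
import heapq


def makespan(split_costs, num_slots):
    """Return the finish time of the last slot under greedy assignment.

    Assumption: each split is handed, in order, to the slot that becomes
    free earliest. This is an illustrative model, not Hadoop's scheduler.
    """
    slots = [0] * num_slots          # time at which each slot becomes free
    heapq.heapify(slots)
    for cost in split_costs:
        earliest = heapq.heappop(slots)
        heapq.heappush(slots, earliest + cost)
    return max(slots)


# Hypothetical workload: 120 units of work on 4 map slots.
coarse = [60, 20, 20, 20]            # one split per slot, one of them heavy
fine = [12] * 5 + [4] * 15           # same work, each split cut into 5 pieces

print("coarse splits:", makespan(coarse, 4))
print("fine splits:  ", makespan(fine, 4))
```

With one split per slot, the heavy split holds up the whole job while the other slots sit idle. With smaller splits, idle slots can pick up the remaining pieces, so the finish time lands much closer to the ideal of 30 units per slot.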