Reply-To: core-user@hadoop.apache.org
Date: Wed, 24 Dec 2008 15:31:38 +0530
Subject: Re: How to coordinate nodes of different computing powers in a same cluster?
From: Devaraj Das

On 12/24/08 3:20 PM, "Aaron Kimball" wrote:

> Jeremy,
>
> A clarification: there is currently no mechanism in Hadoop to slot
> particular tasks on particular nodes. Hadoop does not take into account a
> particular node's suitability for a given task; if one node has more CPU,
> and another node has more IO, you cannot indicate that certain tasks should
> be done on the CPU-intense nodes, and others on the IO-intense nodes.
>
> Speculative execution, though, means that any tasks which are "left behind"
> near the end of a job will be re-executed in parallel on multiple other
> "empty" nodes which are waiting for the full job to complete. Hopefully,
> it'll also pick a "correct" node for the task via this secondary random
> placement, if it didn't do it in the first apportioning of jobs. By default,
> I think map task speculation is enabled, but reduce task speculation is
> disabled.
>

By default, speculative execution is enabled for both map and reduce tasks.
But yes, the current implementation of speculative execution has some
shortcomings that https://issues.apache.org/jira/browse/HADOOP-2141 is trying
to address (including avoiding scheduling speculative tasks on slow machines).

The other thing to note is that faster machines will execute more tasks than
slower machines when there are lots of tasks to execute, since machines pull
tasks from the JobTracker when they are done running their current tasks.

> - Aaron
>
> On Wed, Dec 24, 2008 at 1:12 AM, Devaraj Das wrote:
>
>> You can enable speculative execution for your jobs.
>>
>>
>> On 12/24/08 10:25 AM, "Jeremy Chow" wrote:
>>
>>> Hi list,
>>> I've come up against a scenario like this, to finish a same task, one of my
>>> hadoop cluster only needs 5 seconds, and another one needs more than 2
>>> minutes.
>>> It's a common phenomenon that will decrease the parallelism of our system
>>> due to the faster one will wait the slower one. How to coordinate those
>>> nodes of different computing powers in a same cluster?
>>>
>>> Thanks,
>>> Jeremy
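
To illustrate the point above about machines pulling tasks from the JobTracker when they finish their current work, here is a minimal simulation sketch. It is not Hadoop code: the function name, the node names, and the one-task-at-a-time model are assumptions made for illustration, and it ignores speculative execution, data locality, and multiple task slots per node. The 5-second and 120-second figures are taken from Jeremy's original question.

```python
import heapq


def simulate_pull_scheduling(task_seconds_by_node, num_tasks):
    """Simulate nodes pulling tasks from a shared queue as soon as they are idle.

    task_seconds_by_node maps a (hypothetical) node name to the time that
    node needs to run one task. Returns (tasks_completed_per_node, makespan).
    """
    # Min-heap of (time_when_node_becomes_idle, node_name).
    # Every node starts idle at t = 0.
    idle_at = [(0.0, node) for node in sorted(task_seconds_by_node)]
    heapq.heapify(idle_at)

    completed = {node: 0 for node in task_seconds_by_node}
    makespan = 0.0

    for _ in range(num_tasks):
        # The node that becomes idle first pulls the next task.
        now, node = heapq.heappop(idle_at)
        finish = now + task_seconds_by_node[node]
        completed[node] += 1
        makespan = max(makespan, finish)
        heapq.heappush(idle_at, (finish, node))

    return completed, makespan


if __name__ == "__main__":
    # Hypothetical two-node cluster: 5 s per task vs. 120 s per task.
    counts, makespan = simulate_pull_scheduling({"fast": 5.0, "slow": 120.0}, 100)
    print("tasks per node:", counts)
    print("makespan (s):", makespan)
```

Under these assumptions the fast node ends up running the large majority of the tasks, simply because it returns for more work far more often. The slow node still matters near the end of a job, when a single long-running task on it can hold up completion; that tail is the case speculative execution is meant to cover.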