From: "Yoram Arnon"
Reply-To: hadoop-dev@lucene.apache.org
Subject: RE: [jira] Commented: (HADOOP-7) MapReduce has a series of problems concerning task-allocation to worker nodes
Date: Tue, 18 Jul 2006 10:28:55 -0700
Message-ID: <00af01c6aa8f$a55e1710$496715ac@ds.corp.yahoo.com>
In-Reply-To: <17317845.1153223054812.JavaMail.jira@brutus>

It works for me. Reduce tasks start executing from the get-go, but at first
they just sit there waiting for map output to become available. Until the
last map completes, all they can do is copy that output over; the sort-merge
process cannot start before then. So you can't expect more than 25% of reduce
to be complete before map is 100% complete, and often you get less. But I do
see some progress from the start.

Do you have mapred.tasktracker.tasks.maximum configured to more than 1? I use
the value 2, creating 2 map tasks and 2 reduce tasks per node.

Yoram

> -----Original Message-----
> From: Mikkel Kamstrup Erlandsen (JIRA) [mailto:jira@apache.org]
> Sent: Tuesday, July 18, 2006 4:44 AM
> To: hadoop-dev@lucene.apache.org
> Subject: [jira] Commented: (HADOOP-7) MapReduce has a series
> of problems concerning task-allocation to worker nodes
>
>     [ http://issues.apache.org/jira/browse/HADOOP-7?page=comments#action_12421847 ]
>
> Mikkel Kamstrup Erlandsen commented on HADOOP-7:
> ------------------------------------------------
>
> This is likely me being dumb, but I don't think this issue is fixed.
>
> When I run any of the provided example programs, wordcount/grep (also pi
> with speculative execution enabled), reduce tasks do not start before all
> map tasks have completed.
>
> My cluster contains three nodes and I am running Hadoop 0.4.0.
>
> > MapReduce has a series of problems concerning task-allocation to worker nodes
> > -----------------------------------------------------------------------------
> >
> >          Key: HADOOP-7
> >          URL: http://issues.apache.org/jira/browse/HADOOP-7
> >      Project: Hadoop
> >   Issue Type: Bug
> >  Environment: All
> >     Reporter: Mike Cafarella
> >      Fix For: 0.1.0
> >
> >  Attachments: jobtracker.patch
> >
> >
> > The MapReduce JobTracker is not great at allocating tasks to TaskTracker
> > worker nodes.
> > Here are the problems:
> > 1) There is no speculative execution of tasks.
> > 2) Reduce tasks must wait until all map tasks are completed before doing
> >    any work.
> > 3) TaskTrackers don't distinguish between Map and Reduce jobs. Also, the
> >    number of tasks at a single node is limited to some constant. That
> >    means you can get weird deadlock problems upon machine failure. The
> >    reduces take up all the available execution slots, but they don't do
> >    productive work, because they're waiting for a map task to complete.
> >    Of course, that map task won't even be started until the reduce tasks
> >    finish, so you can see the problem...
> > 4) The JobTracker is so complicated that it's hard to fix any of these.
> >
> > The right solution is a rewrite of the JobTracker to be a lot more
> > flexible in task handling. It has to be a lot simpler. One way to make it
> > simpler is to add an abstraction I'll call "TaskInProgress". Jobs are
> > broken into chunks called TasksInProgress. All the TaskInProgress objects
> > must be complete, somehow, before the Job is complete.
> >
> > A single TaskInProgress can be executed by one or more Tasks.
> > TaskTrackers are assigned Tasks. If a Task fails, we report it back to
> > the JobTracker, where the TaskInProgress lives. The TIP can then decide
> > whether to launch additional Tasks or not.
> >
> > Speculative execution is handled within the TIP. It simply launches
> > multiple Tasks in parallel. The TaskTrackers have no idea that these
> > Tasks are actually doing the same chunk of work. The TIP is complete when
> > any one of its Tasks is complete.
>
> --
> This message is automatically generated by JIRA.
> -
> If you think it was sent incorrectly contact one of the administrators:
> http://issues.apache.org/jira/secure/Administrators.jspa
> -
> For more information on JIRA, see:
> http://www.atlassian.com/software/jira
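
For anyone reading this thread in the archive, the TaskInProgress idea in the
quoted issue description can be sketched roughly as below. This is only a
minimal illustration under my own assumptions, not the code in
jobtracker.patch or in Hadoop itself: the class name TaskInProgressSketch, the
maxAttempts cap, and every method name are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "TaskInProgress" (TIP) idea from HADOOP-7.
// Names and the retry policy are assumptions, not Hadoop's actual API.
class TaskInProgressSketch {
    enum State { RUNNING, SUCCEEDED, FAILED }

    private final String tipId;      // identifies the chunk of work this TIP tracks
    private final int maxAttempts;   // assumed cap on Tasks launched for this chunk
    private final List<State> attempts = new ArrayList<>();
    private boolean complete = false;

    TaskInProgressSketch(String tipId, int maxAttempts) {
        this.tipId = tipId;
        this.maxAttempts = maxAttempts;
    }

    // Launch one more Task for this chunk. Calling this while another Task is
    // still running is what speculative execution amounts to in this sketch.
    // Returns the attempt index, or -1 if no further Task may be launched.
    int launchTask() {
        if (complete || attempts.size() >= maxAttempts) {
            return -1;
        }
        attempts.add(State.RUNNING);
        return attempts.size() - 1;
    }

    // The TIP is complete as soon as any one of its Tasks completes.
    void taskSucceeded(int attempt) {
        attempts.set(attempt, State.SUCCEEDED);
        complete = true;
    }

    // A failure is reported back to the TIP, which decides whether another
    // Task is needed. Returns true if a replacement should be launched.
    boolean taskFailed(int attempt) {
        attempts.set(attempt, State.FAILED);
        return !complete && !hasRunningTask() && attempts.size() < maxAttempts;
    }

    boolean hasRunningTask() {
        return attempts.contains(State.RUNNING);
    }

    boolean isComplete() {
        return complete;
    }

    String getTipId() {
        return tipId;
    }

    public static void main(String[] args) {
        TaskInProgressSketch tip = new TaskInProgressSketch("map_0001", 3);
        int first = tip.launchTask();
        int speculative = tip.launchTask();        // same chunk, second Task
        boolean relaunch = tip.taskFailed(first);  // the other Task is still running
        tip.taskSucceeded(speculative);
        System.out.println(tip.getTipId() + " relaunch=" + relaunch
                + " complete=" + tip.isComplete());
    }
}
```

In this sketch the JobTracker would hold one such object per chunk and hand
the individual attempts to TaskTrackers, which never learn that two attempts
cover the same work. A real scheduler would also need to kill the losing
attempts once one succeeds, which is left out here.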